For most of your career, the answer to “build or buy?” has been obvious. Building a meaningful SaaS MVP costs $500K–$1M, takes 6–12 months, and needs a team. Buying was almost always the rational call.
That assumption is now wrong for a growing category of tools.
AI coding tools — Cursor, Claude Code, GitHub Copilot — have, over roughly the past 18 months, collapsed the tooling cost of building a functional SaaS alternative to near zero. The build vs. buy ROI calculation looks entirely different when development time shrinks from months to days. This article is part of our comprehensive SaaS Reckoning guide, which covers the full landscape of AI’s impact on enterprise software.
For a growing category of point tools, the question is no longer “can we afford to build it?” It is “can we afford to keep paying for it?”
This article covers what has changed, what the evidence shows, and a practical framework for when building now makes sense. Shopify, Duolingo, and Cursor are corporate-scale evidence — not proof-of-concept demos.
What has actually changed about software development economics in the past 18 months?
Let’s be specific about the numbers. Previously: $500K–$1M, a team of 5–10 engineers, 6–12 months. That cost structure made buying almost always rational. The SaaS vendor had amortised those costs across thousands of customers — you could not compete on price.
AppDirect reached more than 90% AI-generated code in one year, reporting a 70% reduction in development costs and a 5x output multiplier: in-house teams building five times more applications than traditional IT teams could support. At Google, over 30% of new code is now generated with AI assistance. These are operational realities at scaled organisations — not experiments.
When build cost drops from $500K to $5K in engineering time, tools that were obviously “buy” become rational “build” candidates. As Joe Reis put it: “That ‘build versus buy’ spreadsheet you’ve been using? It’s increasingly obsolete, so factor in what AI can generate, how quickly, and at what cost.”
What are AI coding tools actually doing — and why is “vibe coding” not just hype?
Vibe coding — Andrej Karpathy’s term for AI-assisted development via natural language prompts — is the name that has stuck. The name undersells the tools.
Cursor, Claude Code, GitHub Copilot, and Codex are full development environments. They handle architecture, testing, refactoring, documentation, and debugging — not just autocomplete. The workflow is Plan → Code → Review: build a feature specification with the LLM, generate code via an agentic loop, then review. It is a genuinely different process from writing code manually. (The same agentic loop dynamic is what makes AI agents disruptive to SaaS business models at a mechanism level — worth understanding before committing to a build or buy position.)
AppDirect reached 90% AI-generated code in one year: speed increased, onboarding improved, quality held. The work shifted from typing syntax to architecture, assumptions, and product thinking.
AI-generated code carries real security and maintenance risks if your review process is weak. The productivity gain is documented. So is the discipline requirement. The question is not whether the tools are ready — it is whether your team has the discipline to use them well.
What is compound engineering — and what does it mean when one developer can ship five products?
Compound engineering — referenced in a16z’s analysis of the AI software development stack — describes a single developer using AI coding agents to maintain and ship multiple products simultaneously. AppDirect’s CEO puts the ratio at 5x. Engineers who adopt AI tools at OpenAI open 70% more pull requests than their peers.
“We don’t have the engineering capacity to build and maintain it” was a legitimate objection when an internal tool required a dedicated team. For narrow-function tools, that objection no longer holds in the same way. One developer with AI tooling can own what previously required three to five.
What compound engineering does not mean: it is not a case for eliminating headcount. Your team-sizing cost models were built for pre-AI output rates. Those need updating.
What have companies actually built — and how fast?
The most credible evidence operates at scale — not in startups with nothing to lose.
Shopify’s AI-first policy: Tobi Lütke’s mandate requires developers to demonstrate they cannot use AI tools before requesting additional headcount. AI use is a baseline performance expectation — built into hiring and performance evaluation, not a productivity suggestion.
Duolingo’s 148 languages: Duolingo launched 148 new language courses in under a year, more than doubling its existing content. Under traditional development this would have required years and a much larger team — the most concrete output-per-developer data point in the public record.
Cursor as dual evidence: Cursor reached $500M ARR and approximately $10B valuation within 15 months — a company built using its own AI-coding paradigm, adopted at a rate that signals category-level acceptance in professional engineering teams. For a broader analysis of how AI-native startups are challenging SaaS incumbents, Cursor is one of many examples across the application layer.
The Shopify and Duolingo examples are the ones to anchor to: established organisations with mature engineering practices and real customers.
What does it now cost to build an alternative to a SaaS tool — the $0 MVP in practice?
The $0 MVP concept refers to near-zero tooling cost, not zero cost. Cursor costs under $100 per developer per month. Claude Code and GitHub Copilot are similarly priced. The remaining cost is engineer-hours — and that number has dropped significantly.
Ben Yoskovitz at Highline Beta built a full contacts tool in approximately 2 days with Lovable, replacing a CRM subscription for 30–40 users. A basic CRM dashboard — previously 3–4 months at $150K–$200K — can now be built in 1–2 developer weeks.
What the $0 MVP does not include: ongoing maintenance, security review, infrastructure costs, and developer time to iterate. As the Focused Chaos analysis notes: “You have to maintain the software. Users will want more features. They’ll report bugs and want support.” There is also the institutional knowledge risk — if the people who built it leave, you are left with an orphaned application.
When annual SaaS cost exceeds the cost to build and maintain internally, building is now economically rational. The SaaS CFO frames this well: customers do not even need to build an alternative to change renewal negotiations — they just need to credibly threaten it. The broader SaaS reckoning framework explains why this negotiating leverage has shifted so decisively toward buyers.
When does it make sense to build — and when should you still buy?
The answer in 2026 is not “always build.” Klarna replaced Salesforce CRM with an internally developed AI system, customer satisfaction declined, and they reversed course. Targeted replacement of narrow-function point tools is the right model.
Four criteria determine the build case:
1. Workflow type: Point tools with narrow, well-defined functionality are now prime build candidates. Systems of record — payroll, ERP, compliance data, financial ledgers — are still firmly “buy.” AppDirect’s CTO Andy Sen is direct: “For anything where the results have to be one hundred percent predictable, there’s no room for hallucinations. Financial software, core billing, systems of record for medical information — those aren’t going to be replaced by AI anytime soon.”
2. Regulatory exposure: Any workflow touching regulated data — healthcare, financial reporting, payments, HR compliance — should remain “buy.” Validation overhead negates build cost savings for most mid-market teams.
3. Team capacity for maintenance: Compound engineering enables one developer to own more, but it requires that developer to exist and stay. If the team cannot absorb ongoing maintenance, the economics shift back toward buying.
4. Competitive differentiation potential: If the functionality is a source of competitive advantage, building gives you roadmap ownership. If it is commodity workflow management, the differentiation value is zero.
The practical trigger: if a SaaS tool costs more per year than four weeks of a developer’s fully-loaded time, is not in a regulated category, has narrow scope, and your team has maintenance capacity, the build case is worth running.
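The four criteria and the cost trigger can be expressed as a simple screening check. The function below is an illustrative sketch, not a formula from this article — the example figures are hypothetical:

```python
# Hypothetical build-vs-buy screening check based on the four criteria.
# All figures in the example are illustrative assumptions, not benchmarks.

def build_case_worth_running(
    annual_saas_cost: float,        # current subscription cost per year
    dev_annual_loaded_cost: float,  # fully-loaded annual cost of one developer
    regulated: bool,                # does the workflow touch regulated data?
    narrow_scope: bool,             # point tool with well-defined functionality?
    has_maintenance_capacity: bool, # can the team absorb ongoing ownership?
) -> bool:
    # "Four weeks of a developer's fully-loaded time" as the cost threshold.
    four_weeks_of_dev_time = dev_annual_loaded_cost * (4 / 52)
    return (
        annual_saas_cost > four_weeks_of_dev_time
        and not regulated
        and narrow_scope
        and has_maintenance_capacity
    )

# Example: a $25K/year point tool vs. a $200K/year fully-loaded developer.
# Four weeks of that developer's time is roughly $15.4K, so the cost
# threshold is met — and the other three criteria pass.
print(build_case_worth_running(25_000, 200_000, regulated=False,
                               narrow_scope=True,
                               has_maintenance_capacity=True))  # True
```

This is a screen, not a decision: passing it means the build case is worth modelling in full, including maintenance and infrastructure costs.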
The full portfolio audit and vendor renegotiation playbook is covered in Auditing and Rebuilding Your SaaS Stack in the Age of AI.
What do AI coding economics mean for your engineering team?
The compound engineering model changes what is valuable in engineering work, not just how fast it gets done.
The developer who can specify system behaviour precisely, review AI output critically, and make sound architectural decisions is more valuable than the developer who writes boilerplate quickly. Language polyglot skills are becoming less important — system design, architectural judgement, code review discipline, and domain expertise are the differentiators now.
Shopify’s AI-first policy made this explicit: AI tool proficiency is built into hiring criteria and performance evaluation. “Can you work effectively with AI tools?” is the baseline expectation.
If your team is already using AI coding tools, the maintenance overhead of building internal tools is lower than your cost model suggests. a16z’s framing: staff for judgement and ownership, not headcount-to-output ratios.
The Refactoring FM survey of 340 engineering teams found 44% have no dedicated time for AI experimentation. The output gap is already observable. Shopify’s policy and Duolingo’s output metrics are not future projections.
For the complete picture — from understanding the SaaS market shift through to auditing your stack and renegotiating vendor contracts — see the full SaaS Reckoning guide.
FAQ
What is vibe coding and is it production-ready?
Vibe coding — popularised by Andrej Karpathy in early 2025 — means writing software specifications in natural language and having AI generate the code. It is production-ready when used with proper code review and security discipline. AppDirect runs 90%+ AI-generated code in production with a 70% reduction in development costs. The risk is weak review processes that introduce problems expensive to fix later.
How much does it actually cost to build a SaaS alternative with AI coding tools?
Tooling cost is near-zero — Cursor, Claude Code, and GitHub Copilot each cost under $100 per developer per month. The real cost is engineer-hours: a point tool with narrow scope can be built in 1–2 developer weeks. Ongoing maintenance is what most calculations underestimate — budget 10–20% of initial build time per year.
Which AI coding tool gives the best results — Cursor, Claude Code, or GitHub Copilot?
They serve different use cases. Cursor is an AI-first code editor for greenfield development. Claude Code handles complex reasoning and architecture. GitHub Copilot integrates tightly with existing IDE workflows. Most engineering teams use more than one — they are complementary, not competing choices.
What is compound engineering?
One developer using AI coding agents can maintain and ship multiple software products simultaneously — an output ratio that previously required teams of five to ten. AppDirect’s CEO puts the ratio at 5x. AI handles implementation; the developer focuses on architecture, specification, and review.
Is vibe coding safe for regulated industries like healthcare or finance?
The validation overhead changes the economics significantly. In regulated contexts — HIPAA, PCI-DSS, SOX — every piece of code requires security review, audit trail documentation, and compliance testing. Time saved in writing is partially offset by validation requirements. For most mid-market teams, payroll, financial ledgers, and healthcare data platforms remain “buy.”
What is Shopify’s AI-first policy and what does it actually require of developers?
Tobi Lütke’s mandate: developers must demonstrate they cannot accomplish a task with AI tools before requesting additional headcount. AI use is a baseline performance expectation built into hiring and evaluation. For other engineering leaders, this is a policy precedent worth watching.
Should I cancel SaaS subscriptions and build everything with AI tools?
No. Klarna replaced Salesforce CRM with an internally built AI system, customer satisfaction declined, and they reversed course. Targeted replacement of point tools where the economics favour building is the rational approach. ERP, payroll, financial systems of record, and tools where vendor support is load-bearing should remain “buy.”
What categories of SaaS tools are most rational to replace with AI-built alternatives?
Build candidates: reporting dashboards on your own data, simple internal portals, basic workflow automation, lightweight CRM-adjacent tools, data pipeline tools, and simple ticketing systems. Still buy: ERP, payroll, financial and healthcare data platforms, billing and subscription management, identity and access management, and security tooling. The principle: system of engagement, narrow scope, no regulated data — realistic build candidate.
How do I know when the economics of building actually beat buying?
When annual SaaS subscription cost exceeds (initial build cost + annual maintenance + infrastructure cost). A rough heuristic: if the tool costs more than four weeks of a developer’s fully-loaded time, is a point solution outside regulated categories, and your team has capacity to maintain it — the build case is worth running the numbers.
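As a quick sanity check, the inequality above can be run with hypothetical numbers — none of these figures are benchmarks, they only show how the comparison works:

```python
# Illustrative break-even arithmetic for the build-vs-buy inequality.
# Every figure here is a hypothetical assumption.

initial_build = 10_000                      # roughly 1-2 developer-weeks, fully loaded
annual_maintenance = 0.15 * initial_build   # 10-20% of build effort per year (midpoint)
annual_infrastructure = 1_200               # hosting, monitoring, backups

first_year_build_total = initial_build + annual_maintenance + annual_infrastructure
annual_saas_cost = 18_000

print(first_year_build_total)                     # 12700.0
print(annual_saas_cost > first_year_build_total)  # True -> building wins in year one
```

Note that from year two onward only maintenance and infrastructure recur, so the build side of the inequality gets cheaper over time.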
What does the compound engineering model mean for how I should size my engineering team?
A team of five using AI tools may now outship a traditional team of ten to fifteen. This does not mean reducing headcount — it means hiring for judgement (system design, architecture, domain expertise, review discipline) rather than implementation throughput. Re-run your headcount projections with compound engineering assumptions before finalising the number.
Why is Cursor both a tool and an evidence point?
Cursor reached $500M ARR and approximately $10B valuation within 15 months — a company built using its own AI-coding paradigm, growing at a rate that signals category-level acceptance in professional engineering teams.
What is the difference between building with AI tools and traditional custom software development?
Traditional: developer writes code manually, cost front-loaded in engineering time, maintenance permanent. AI-assisted: developer specifies behaviour, AI generates implementation, developer reviews and iterates, cost reduced by 50–70%+ depending on task. The key difference: AI-generated code still requires maintenance — the ongoing cost is lower, but not zero, and your team must retain the skill to understand and modify what was built.