You’re probably thinking about adding AI to your product. Maybe you’ve already shipped something. And you’re looking at your financials wondering why your margins look terrible compared to the rest of the SaaS world.
This article is part of our comprehensive guide to startup refounding and AI-driven business model transformation, where we explore the economic realities companies face when transitioning to AI-first business models.
Here’s what’s happening. Traditional SaaS enjoys gross margins of 80-90%. Your AI-first product? You’re running at 25-60% if you’re lucky. Some companies started negative.
The difference comes down to one thing: every AI model invocation costs real money. Unlike traditional software where serving another thousand users barely moves the needle on infrastructure costs, AI burns cash with every request.
GitHub Copilot learnt this the hard way. While charging a flat $10/month, they were reportedly losing more than $20 per user per month on average, and as much as $80 on the heaviest users. That’s the kind of margin compression that kills companies. You can read the full GitHub Copilot margin case study for specific details on how they addressed this.
But there are three ways out. Infrastructure optimisation gets you routing simple requests to cheaper models. Pricing model evolution moves you from flat rates to hybrid and outcomes-based approaches. Product bundling strategies let you capture more value per customer.
This guide walks through the economics, compares the pricing approaches, and gives you frameworks for making the transition. We’ll look at what companies like Replit, Cursor, and Intercom actually did to fix their margin problems.
Why do AI-first SaaS companies have lower gross margins than traditional SaaS?
Traditional SaaS scales beautifully. Once you’ve built the software, serving another hundred thousand users costs almost nothing. Maybe some more hosting capacity. Some customer support. That’s why mature SaaS companies run at 80-90% gross margins.
AI products work differently. Each user action triggers computationally intensive operations with direct variable costs. Every model call costs money.
Think about a coding assistant. One request might trigger dozens of model calls. Understanding intent, searching documentation, generating code, checking syntax, writing tests. The features users love most burn through margins fastest.
Your COGS looks completely different. Traditional SaaS: hosting, support, minor compute. AI-first: inference costs, GPU clusters, vector databases, API fees. The list is longer and every line item is bigger.
Bessemer’s “State of AI 2025” splits AI companies into two categories. Supernovas run at roughly 25% margins with unoptimised infrastructure and experimental pricing. Shooting Stars hit 60% margins after custom models and refined pricing. The gap? Infrastructure maturity and pricing sophistication.
If you’re using OpenAI or Anthropic APIs, you’re paying per token. Every request. 84% of companies see 6%+ gross margin erosion from AI infrastructure costs.
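To make the per-token arithmetic concrete, here is a minimal sketch of how inference spend flows into gross margin for a single user. Every number in it (request volume, token counts, the $3 and $6 per-million-token rates, the $25 price, the $1.50 of other COGS) is an illustrative assumption, not data from any company mentioned here.

```python
def monthly_inference_cost(requests: int, input_tokens: int, output_tokens: int,
                           input_price_per_m: float, output_price_per_m: float) -> float:
    """Monthly inference spend for one user, with prices quoted per million tokens."""
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return requests * per_request


def gross_margin(price: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (price - cogs) / price


# Hypothetical user: 2,000 requests a month, 1,500 input and 500 output tokens each.
inference = monthly_inference_cost(2_000, 1_500, 500, 3.0, 6.0)
other_cogs = 1.50   # hosting and support (assumed)
price = 25.0        # flat monthly subscription (assumed)

margin = gross_margin(price, inference + other_cogs)
print(f"inference: ${inference:.2f}/user/month, gross margin: {margin:.0%}")
```

Double the request volume in this sketch and the same subscription goes underwater, which is the flat-rate trap in miniature.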
Ben Murray from SaaS CFO puts it well: “If SaaS is about margin efficiency, AI is about value density”. You’re optimising for how much output, productivity, or labour you replace per dollar of compute.
How do outcomes-based pricing models work in AI-first companies?
Outcomes-based pricing in practice relies on what industry practitioners call “pricing on proxies”: near-outcome metrics that are easier to measure than actual business results. Companies pick measurable metrics that correlate with value, even if they don’t perfectly capture it.
Intercom demonstrates this. Their CEO claimed their Fin AI agent grew to 8-figure ARR with 393% annualised Q1 growth by tying revenue to ticket resolutions. That’s outcome-based pricing working. But notice they’re charging for resolutions, not for “improved customer satisfaction” or “reduced churn”. Resolutions are measurable. Satisfaction is squishy. To give customers flexibility, Intercom offers both credit buckets and outcome-based options, letting different segments choose what works. For more detailed examples of how companies implement outcomes-based pricing, see our case studies from companies like GitHub Copilot and Sierra.
Zapier’s Agent illustrates the measurement challenge. The same tool generates entirely different outcomes depending on use case. Customer support deflects tickets. Sales books meetings. Marketing creates content. What’s the outcome? Which one do you price on?
Attribution gets messy. External variables influence results and business outcomes become impossible to measure cleanly.
Decagon wrestles with this tension in their pricing strategy. Bihan Jiang, their Director of Product, explains: “By focusing on conversation volume rather than parsing ‘outcome,’ incentives stay clean”. Conversations are easy to count. Resolutions are harder. Most companies start with the simpler metric and migrate as measurement systems mature.
Most 2025 enterprise AI deals rely on usage-based or hybrid pricing. Pure outcome-based pricing remains rare because customers need predictability.
Joey Quirk from Chargebee calls outcomes-based pricing “usage pricing with a marketing degree”. You’re still charging for something the product does. You’ve just picked a metric closer to value.
Industry practitioners recommend a parallel approach: “Measure outcomes even when you don’t price on them”. Build dashboards, establish baselines, create feedback loops. This builds trust that sustains pricing power.
What are AI-first SaaS gross margins compared to traditional SaaS margins?
Traditional SaaS: 80-90% gross margins. AI-first early stage: roughly 25%. AI-first mature: roughly 60%.
Bessemer’s data shows LLM-native companies maintain around 65% gross margin while growing roughly 400% year-over-year. These are the companies that figured it out.
The difference? API dependency versus custom models. Flat-rate pricing versus usage-based. No caching versus intelligent routing. Poor cost visibility versus dashboards showing engineers what their code costs.
Understanding these pricing models matters because the margin differences are substantial. 92% of AI software companies now use mixed pricing models combining subscriptions with consumption fees. That shift happened fast. 41% of leading SaaS teams use hybrid models, up from 27%. This pricing evolution is a key component of successful AI transformation strategies.
Replit experienced gross margins below 10% before adopting usage-based pricing. They eventually improved to the 20-30% range. That’s real money at scale.
Companies using hybrid models report the highest median growth rate at 21%, outperforming pure subscription and pure usage-based models. The market has spoken. Hybrid wins.
44% of SaaS companies now charge for AI-powered features, unlocking new revenue streams. This is the “value bundling” play. Your traditional SaaS product runs at 80% margins. Your AI features run at 40% margins. Bundle them together, charge more, land somewhere profitable.
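The bundling maths is a revenue-weighted average. A small sketch, assuming a hypothetical bundle where $70 of the price is attributed to the core product and $30 to the AI features:

```python
def blended_margin(lines: list[tuple[float, float]]) -> float:
    """Revenue-weighted gross margin across (revenue, margin) product lines."""
    total_revenue = sum(revenue for revenue, _ in lines)
    gross_profit = sum(revenue * margin for revenue, margin in lines)
    return gross_profit / total_revenue


# Hypothetical split: core SaaS at 80% margin, AI add-on at 40% margin.
bundle = [(70.0, 0.80), (30.0, 0.40)]
print(f"blended gross margin: {blended_margin(bundle):.0%}")
```

The revenue split is the lever. The more of the bundle price the AI features account for, the closer the blended figure drifts towards 40%.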
The trajectory is clear. You start at 25% margins with API dependency and flat pricing. You optimise infrastructure and shift to hybrid pricing. You hit 40% margins. You develop custom models and refine your pricing further. You reach 60% margins. AI-driven SaaS will likely mature towards 60-70% gross margins. Lower than legacy software but sustainable with proper cost management.
How to improve AI-first SaaS gross margins from 25% to 60%?
These margin gaps aren’t permanent. Companies follow predictable paths to close them.
The path from 25% to 60% margins follows three phases: immediate pricing adjustments, medium-term infrastructure optimisation, and long-term custom model development.
Phase 1 runs 0-6 months. Companies achieving higher margins shift from flat-rate to hybrid pricing with usage components. This is the quick win. They’re capturing some of the variable costs from customers instead of eating all of it.
Successful companies implement cost transparency dashboards for engineering teams. Developers need to see what their code costs. Most teams have no visibility. Finance sees a number in Looker once a month. No one can explain how it got there.
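What sits behind a cost dashboard is usually plain attribution: tag each model call with who triggered it and which feature it served, then roll the costs up. A minimal sketch, where the `InferenceCall` record, the customer names, and the per-call costs are all hypothetical:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceCall:
    customer: str
    feature: str
    cost_usd: float


def cost_by(calls: list[InferenceCall], field: str) -> dict[str, float]:
    """Total cost grouped by the given field, most expensive group first."""
    totals: dict[str, float] = defaultdict(float)
    for call in calls:
        totals[getattr(call, field)] += call.cost_usd
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


calls = [
    InferenceCall("acme", "code_completion", 0.004),
    InferenceCall("acme", "test_generation", 0.031),
    InferenceCall("globex", "code_completion", 0.006),
    InferenceCall("globex", "test_generation", 0.052),
]

by_feature = cost_by(calls, "feature")
by_customer = cost_by(calls, "customer")
print(by_feature)
print(by_customer)
```

In a real system the call records would come from request logs or provider billing exports rather than a hard-coded list.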
Phase 2 runs 6-12 months. The winning companies deploy intelligent routing. They direct simple requests, roughly 80% of traffic, to cheaper models and reserve expensive ones for complex tasks. This is the middle ground between full API dependency and custom models.
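A router can start out very simple. The sketch below uses a keyword and length heuristic to pick a tier. The tier names, per-request costs, keyword list, and word threshold are all made up for illustration, and production routers often use a small classifier model instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_request: float  # hypothetical average cost in USD


CHEAP = ModelTier("small-model", 0.001)
PREMIUM = ModelTier("large-model", 0.020)

# Words that, in this toy heuristic, suggest a request needs the bigger model.
COMPLEX_HINTS = ("refactor", "architecture", "debug", "multi-file")


def route(prompt: str, max_simple_words: int = 40) -> ModelTier:
    """Send long or complex-looking prompts to the premium tier, the rest to the cheap one."""
    text = prompt.lower()
    is_long = len(text.split()) > max_simple_words
    looks_complex = any(hint in text for hint in COMPLEX_HINTS)
    return PREMIUM if is_long or looks_complex else CHEAP


print(route("rename this variable").name)
print(route("debug the failing deploy").name)
```

The heuristic will misroute some requests, so it is worth logging which tier handled each one and sampling the cheap tier's answers for quality.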
Caching strategies help. How many requests are near-duplicates? Can you cache responses for common queries? Every cache hit is a free inference.
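Here is a minimal sketch of an exact-match response cache with light normalisation. It only catches prompts that differ in case or whitespace. Catching true near-duplicates would need something like embedding similarity, which this example does not attempt. The `fake_model` function stands in for a real model call.

```python
import hashlib
from typing import Callable


class ResponseCache:
    """In-memory cache keyed on a normalised prompt."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Lowercase and collapse whitespace so trivial variants share a key.
        normalised = " ".join(prompt.lower().split())
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt: str, compute: Callable[[str], str]) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = compute(prompt)
        self._store[key] = result
        return result


def fake_model(prompt: str) -> str:
    """Placeholder for a paid inference call."""
    return f"answer to: {prompt.strip()}"


cache = ResponseCache()
cache.get_or_compute("How do I reset my password?", fake_model)
cache.get_or_compute("  how do I RESET my password? ", fake_model)
print(cache.hits, cache.misses)
```

A production cache would also need expiry and per-customer isolation so one tenant never sees another's responses.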
Phase 3 runs 12-24 months. Market leaders develop custom fine-tuned models for high-volume use cases. They reduce third-party API dependency. This requires real investment but delivers 50-70% cost reduction at scale. For technical details on moving from API dependency to custom models, see our guide on infrastructure optimisation strategies.
67% of companies are actively planning to repatriate AI workloads from cloud to reduce costs. Another 19% are evaluating. 61% already run hybrid AI infrastructure mixing public and private.
Complementary tactics help. Product bundling improves mix. Value-based tier structuring ensures the best customers pay more. Credit system experimentation provides temporary scaffolding while companies refine measurement.
The improvement roadmap isn’t linear. Quick wins come from pricing. Deep margin improvement requires infrastructure investment.
How to shift from subscription pricing to outcomes-based pricing?
One of the most impactful changes in that improvement roadmap is the pricing model transition.
The successful companies don’t rip the bandaid off. They start with a hybrid model. They maintain base platform fees while adding usage or outcome components. This provides a transition path without creating a revenue cliff.
Credit systems serve as temporary scaffolding. Companies give customers predictable pools while they refine outcome measurement.
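Mechanically, a credit pool is a prepaid balance plus a price list in credits. A minimal sketch, where the action names and credit costs are hypothetical:

```python
# Hypothetical price list: how many credits each action consumes.
CREDIT_COSTS = {"chat_reply": 1, "document_summary": 5, "agent_run": 20}


class CreditPool:
    """Prepaid credit balance that refuses actions it cannot cover."""

    def __init__(self, purchased: int) -> None:
        self.balance = purchased

    def spend(self, action: str) -> bool:
        cost = CREDIT_COSTS[action]
        if cost > self.balance:
            return False  # caller decides: block, upsell, or allow overage
        self.balance -= cost
        return True


pool = CreditPool(purchased=25)
print(pool.spend("agent_run"), pool.balance)
print(pool.spend("document_summary"), pool.balance)
print(pool.spend("chat_reply"), pool.balance)
```

Because the price list is separate from the balance, the credit cost of an action can change as inference costs change without touching what the customer paid for the pool.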
The pattern that works: pilot with new customers first. A/B test with cohorts. Extend to renewals gradually. Never force existing customers onto new pricing without grandfather clauses.
Companies build metering infrastructure. They instrument products to track proxy metrics. Conversations, resolutions, API calls. They integrate with billing platforms like Metronome, Chargebee, or Stripe Billing. The smart ones don’t build this from scratch. They leverage existing infrastructure.
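At its core, metering means recording usage events and rolling them up per customer and metric. The sketch below adds de-duplication on an event ID so retried events are not billed twice. The `UsageEvent` shape and the `Meter` class are hypothetical and do not reflect the API of any billing platform named above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    event_id: str   # unique per event, used to drop retries
    customer: str
    metric: str     # e.g. "conversation", "resolution", "api_call"
    quantity: int = 1


class Meter:
    """Aggregates usage per (customer, metric), ignoring duplicate events."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self._totals: Counter = Counter()

    def record(self, event: UsageEvent) -> bool:
        if event.event_id in self._seen:
            return False
        self._seen.add(event.event_id)
        self._totals[(event.customer, event.metric)] += event.quantity
        return True

    def total(self, customer: str, metric: str) -> int:
        return self._totals[(customer, metric)]


meter = Meter()
meter.record(UsageEvent("evt-1", "acme", "conversation"))
meter.record(UsageEvent("evt-2", "acme", "resolution"))
meter.record(UsageEvent("evt-1", "acme", "conversation"))  # retry, ignored
print(meter.total("acme", "conversation"), meter.total("acme", "resolution"))
```

The totals are what you would hand to a billing platform at the end of each period.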
The migration path looks like this: pure subscription → hybrid → usage-heavy hybrid → outcomes-based. Most companies get stuck at “usage-heavy hybrid” because it works well enough. 59% expect usage-based pricing to grow revenue share, up from 18% in 2023.
Customer communication strategy matters more than you think. One head of self-serve monetisation at a product-led SaaS company found usage stopped not because of price, but because admins didn’t trust they’d stay in budget.
A pricing strategist at an enterprise DevOps vendor puts it well: “It’s not about the unit economics. It’s about buyer confidence in total exposure.”
80% of customers report that consumption-based pricing better aligns with value they receive. But they need guardrails.
Fireflies.ai and Synthesia price by output units like meeting minutes or video minutes. This makes value tangible without exposing model complexity. Customers understand “minutes”. They don’t understand “tokens”.
Companies mitigate risk through grandfather clauses for legacy customers, spending limits to prevent bill shock, transparent usage dashboards, and phased rollouts starting with new customers.
OpenAI API dependency vs custom model development: cost comparison
The infrastructure question sits at the heart of the margin problem.
The build-versus-buy decision for AI models comes down to scale and specialisation.
Companies stay with API until reaching $50K-100K monthly inference spend, then evaluate custom development. Below that threshold, API dependency makes sense. Low upfront cost, fast implementation. You’re paying for convenience.
But that convenience has a price. Direct cost-per-use creates immediate margin compression. OpenAI’s list price for fine-tuned GPT-3.5 Turbo, for example, was $3 per million input tokens and $6 per million output tokens. Small numbers until you multiply by millions of users.
Custom model development requires investment. $100K-$500K+ for team, infrastructure, training. But you get 50-70% cost reduction at scale. That’s the payoff.
Intelligent routing sits in the middle. Companies use API for complex queries (20% of requests), fine-tuned models for simple queries (80%). Immediate cost reduction without full custom build.
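To see why the 80/20 split matters, work out the blended cost per request. The per-request costs below are hypothetical placeholders, and only the 80/20 split comes from the text.

```python
def blended_cost_per_request(simple_share: float, cheap_cost: float,
                             premium_cost: float) -> float:
    """Average cost per request when simple_share of traffic goes to the cheap model."""
    return simple_share * cheap_cost + (1 - simple_share) * premium_cost


premium_cost = 0.020  # assumed cost of a premium API call, USD
cheap_cost = 0.002    # assumed cost of a fine-tuned small-model call, USD

routed = blended_cost_per_request(0.80, cheap_cost, premium_cost)
savings = 1 - routed / premium_cost
print(f"blended: ${routed:.4f}/request, saving {savings:.0%} versus all-premium")
```

Even with a perfectly cheap small model the saving cannot exceed the simple share, so the 20% of premium traffic sets the cost floor.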
Successful companies evolved this way. They started with pure API dependency, then shifted to hybrid approaches with multiple model tiers: small models for simple tasks, bigger models for complex generation.
The TCO analysis needs to factor in everything: team costs, infrastructure expenses, training compute, and maintenance overhead versus ongoing API fees.
The break-even analysis changes based on usage volume. Low volume: API wins. Medium volume: intelligent routing wins. High volume: custom models win.
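A rough way to sanity-check the break-even point is a payback calculation. The sketch below takes the 50-70% cost reduction and the $100K-$500K build cost from the ranges above. The $20K monthly overhead for running your own models is an assumption added for illustration.

```python
from typing import Optional


def payback_months(monthly_api_spend: float, cost_reduction: float,
                   upfront_cost: float, monthly_overhead: float = 0.0) -> Optional[float]:
    """Months to recoup a custom-model build, or None if it never pays back."""
    monthly_savings = monthly_api_spend * cost_reduction - monthly_overhead
    if monthly_savings <= 0:
        return None
    return upfront_cost / monthly_savings


# Same build ($300K upfront, 60% reduction, $20K/month overhead) at two volumes.
low_volume = payback_months(30_000, 0.60, 300_000, 20_000)
high_volume = payback_months(200_000, 0.60, 300_000, 20_000)
print(low_volume, high_volume)
```

At the low volume the overhead eats the entire saving, so the project never pays back. At the high volume it pays back within a few months. That asymmetry is why volume drives the decision.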
Technical capability requirements matter. API dependency needs a minimal team. Intelligent routing needs infrastructure engineering. Custom model development needs ML expertise, GPU cluster management, and training pipelines.
Decision framework: If you’re below $50K monthly inference spend, stay with API. If you’re between $50K-$200K monthly, implement intelligent routing. If you’re above $200K monthly, evaluate custom model development.
How to structure usage-based pricing for AI features?
Three common patterns emerge. Pure usage-based charging per API call. Hybrid model with base fee plus usage. Credit pools with pre-purchased consumption buckets.
Customers prefer predictable spending over exact value alignment. Hybrid models balance predictability with cost management.
Metric selection determines customer perception. Choose units that align with value. Conversations beat tokens. Resolutions beat compute time. Minutes beat API calls.
Cursor crossed $1B in ARR less than 24 months from launch with this pricing model. The hybrid approach works.
GitHub Copilot charges $19 USD per user per month (Business) or $39 USD per user per month (Enterprise). Simple per-seat pricing with usage included.
But the economics get interesting at scale. A 500-developer team using GitHub Copilot Business faces $114k in annual costs. Same team on Cursor’s business tier would pay $192k.
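The seat arithmetic behind those totals is easy to check. Note that the $32 per-seat figure for Cursor is inferred by working backwards from the $192k total, not taken from a price list.

```python
def annual_seat_cost(seats: int, price_per_seat_month: float) -> float:
    """Annual cost of a per-seat subscription."""
    return seats * price_per_seat_month * 12


copilot_business = annual_seat_cost(500, 19)  # $19/user/month, from the text
cursor_business = annual_seat_cost(500, 32)   # $32/user/month, inferred from $192k

print(copilot_business, cursor_business, cursor_business - copilot_business)
```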
Tiering strategy combines base platform fees with included usage allowances, charging overages at declining rates. GitHub Copilot’s Pro+ tier offers 1,500 premium requests with $0.04 per additional request.
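A hybrid invoice is a base fee, an included allowance, and overage billed in tiers. In the sketch below, the 1,500 included requests and the $0.04 overage rate come from the example above. The $39 base fee and the declining second tier at $0.03 are assumptions for illustration.

```python
from typing import Optional


def hybrid_invoice(base_fee: float, included_units: int, usage: int,
                   overage_tiers: list[tuple[Optional[int], float]]) -> float:
    """Base fee plus overage, billed tier by tier.

    overage_tiers is an ordered list of (tier_size, unit_price).
    A tier_size of None means the tier has no upper bound.
    """
    remaining = max(0, usage - included_units)
    total = base_fee
    for tier_size, unit_price in overage_tiers:
        if remaining <= 0:
            break
        billed = remaining if tier_size is None else min(remaining, tier_size)
        total += billed * unit_price
        remaining -= billed
    return total


# Flat overage: 500 requests over the allowance at $0.04 each.
flat = hybrid_invoice(39.0, 1_500, 2_000, [(None, 0.04)])

# Declining overage: first 1,000 extra at $0.04, everything beyond at $0.03.
declining = hybrid_invoice(39.0, 1_500, 3_000, [(1_000, 0.04), (None, 0.03)])

print(f"${flat:.2f} ${declining:.2f}")
```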
Pricing psychology matters. Cursor shifted to compute credit pools and triggered customer backlash due to unpredictability. Communication is everything.
Hybrid usage-based pricing breaks existing billing systems. Most companies run separate PLG and SLG stacks, neither supporting clean usage pricing.
Billing meters can be integrated with Stripe, Recurly, and Chargebee. Automated emails come in handy when users approach the next usage tier, get close to rate limits, or run out of credits. This prevents bill shock and builds trust.
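The alerting logic behind those emails can be a small threshold check that fires once per threshold, at the moment usage crosses it. A minimal sketch, where the 50%, 80%, and 100% thresholds are assumed defaults:

```python
def crossed_thresholds(previous_usage: float, current_usage: float, limit: float,
                       thresholds: tuple[float, ...] = (0.5, 0.8, 1.0)) -> list[float]:
    """Thresholds (as fractions of the limit) crossed by this usage update."""
    return [t for t in thresholds if previous_usage < t * limit <= current_usage]


# A customer with a 1,000-unit allowance moves from 700 to 850 units used.
for threshold in crossed_thresholds(700, 850, 1_000):
    print(f"send alert: {threshold:.0%} of allowance used")
```

Comparing against the previous usage value is what stops the same alert from being re-sent on every later update.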
FAQ Section
What’s the difference between outcomes-based pricing and usage-based pricing?
Usage-based pricing charges for consumption metrics like API calls, tokens, or compute time. Outcomes-based pricing charges for results delivered like resolutions, conversions, or value created. In practice, most “outcomes-based” models use “pricing on proxies” rather than true business outcomes. Near-outcome metrics like conversations completed are easier to measure than customer satisfaction.
Why does adding AI to my product hurt my gross margin?
AI introduces variable costs per user interaction through inference expenses. Unlike traditional software where marginal costs approach zero, each model invocation requires compute. If you’re using third-party APIs like OpenAI, you incur direct per-use charges that immediately compress margins.
Should I use OpenAI API or build custom models?
Stay with API until reaching $50K-100K monthly inference spend, then evaluate custom development. For immediate margin improvement without full custom build, implement intelligent routing: use APIs for complex queries (20%), fine-tuned cheaper models for simple queries (80%).
How do companies prevent margin erosion when experimenting with AI features?
Companies achieving better margins implement cost transparency dashboards for engineering teams, set spending caps on API usage, use intelligent routing to cheaper models for simple requests, and instrument products to track inference costs per customer. Many adopt hybrid pricing capturing usage costs from customers.
What are credit systems and should I use them?
Credit systems let customers pre-purchase consumption pools, providing spending predictability while companies refine outcome measurement. Credits serve as valuable transitional scaffolding for iterating teams, but durable strategies anchor to customer-understandable value drivers.
How long does it take to improve margins from 25% to 60%?
Typically 12-24 months, through a phased approach: immediate pricing adjustments (0-6 months), intelligent routing implementation (6-12 months), and custom model development for high-volume use cases (12-24+ months). Quick wins come from pricing. Deep margin improvement requires infrastructure investment.
What pricing model do most AI SaaS companies actually use?
92% of AI software companies use mixed models combining base subscriptions with usage components. Pure subscription pricing is dropping as companies realise flat rates can’t sustain variable AI costs. Companies using hybrid models report the highest median growth rate at 21%.
How do companies measure outcomes for outcomes-based pricing?
Companies use proxy metrics that are measurable and correlate with value. Conversations completed rather than customer satisfaction. Resolutions attempted rather than business impact. Perfect outcome measurement is impractical due to attribution complexity. Proxies provide a practical middle ground.
What tools do companies use to implement usage-based pricing?
Metering instrumentation in products, usage tracking databases, customer-facing consumption dashboards, spending alerts, and integration with billing platforms like Metronome, Chargebee, or Stripe Billing. Most don’t build from scratch. They leverage existing billing infrastructure to focus on product.
How do companies communicate pricing changes to existing customers?
Companies explain cost drivers transparently, show value calculation clearly, offer grandfather clauses for legacy pricing, provide spending caps to prevent bill shock, display real-time usage dashboards, and phase rollout starting with new customers before extending to renewals.
Is it better to price per conversation or per resolution for AI agents?
Per-conversation is easier to measure and explain, but per-resolution aligns better with customer value perception. Most companies start with conversations (simpler attribution) then explore resolution-based pricing once measurement systems mature. Decagon’s analysis shows resolution pricing increases customer willingness to pay.
What gross margin should I target for my AI-first SaaS product?
Early stage: 25-40% is acceptable while optimising infrastructure and pricing. Growth stage: 40-50% target through intelligent routing and hybrid pricing. Mature: 60%+ goal via custom models and refined pricing. Traditional SaaS margins (80-90%) are unlikely for pure AI products due to inherent compute costs.
Next Steps: Strategic Decision-Making
Understanding the economic realities of AI-first pricing and margins is just one piece of the puzzle. If you’re evaluating whether to pursue AI transformation, our guide on strategic decision frameworks for evaluating business model changes provides the frameworks you need to make informed choices about your company’s direction.