The Nvidia Groq Deal Explained — What Was Licensed, Why It Cost $20 Billion, and What Happens Next
Apr 27, 2026


AUTHOR

James A. Wondrasek

In December 2025, Nvidia paid $20 billion to license Groq’s Language Processing Unit (LPU) intellectual property. The deal was a non-exclusive IP licence that brought founder Jonathan Ross, President Sunny Madra, and the core engineering team into Nvidia.

The price was 2.9x Groq’s $6.9 billion valuation from a Series E round completed just three months earlier. That premium is the most important signal in the whole transaction.

Most coverage treats this as a talent grab, an IP play, or an antitrust manoeuvre. It’s all three at once. By the end of this article you’ll be able to explain to your board exactly what Nvidia licensed, why the deal was structured the way it was, and what it means for anyone building on GroqCloud.

This deal is the clearest single example of Nvidia’s broader hardware empire strategy in action — one move inside a much larger monopoly playbook that has been running for years.

What Did Nvidia Actually License from Groq — and What Did It Not Acquire?

The deal is a non-exclusive IP licence. Here’s what Nvidia got: Groq’s LPU chip architecture, the dataflow and static-scheduling design methodology, the compiler toolchain, and the silicon design work including the LP30 — Groq’s third-generation chip, built on Samsung 4nm.

Here’s what Nvidia did not get: Groq Inc. as a corporate entity. Groq remains legally independent. Nvidia also did not acquire GroqCloud, the inference-as-a-service platform, which continues to operate on its own.

Jensen Huang was explicit about this: “While we are adding talented employees to our ranks and licensing Groq’s IP, we are not acquiring Groq as a company.” That sentence is doing a legal job, not just a PR one — it’s the formulation that keeps the deal outside merger-review thresholds.

The “non-exclusive” clause means Groq retains the right to license the same IP to others — AMD, Google, Intel, in theory anyone. In practice, Nvidia is the dominant beneficiary.

Founder Jonathan Ross and President Sunny Madra both transferred to Nvidia. Former CFO Simon Edwards stepped into the CEO role to lead GroqCloud. Groq Inc. still exists as a legal entity, but the people who conceived and built the LPU are now inside Nvidia. Industry analysts have been pretty direct about what this means: the deal achieves commercial equivalence to an acquisition while remaining legally outside acquisition thresholds.

Why Did Nvidia Pay $20 Billion for a $6.9 Billion Company?

The 2.9x premium is not irrational. It reflects three sources of value that compound each other.

IP value. The LPU’s dataflow architecture solves the memory-bandwidth bottleneck that limits GPU inference throughput. At GTC 2026, Jensen Huang said LPU decode should handle 25% of compute in an AI cluster. Ian Buck, Nvidia’s VP of AI and HPC, confirmed: “Integrating the LPU and LPX into our Rubin platform to optimise the decode — that’s where we’re focused right now.” The Groq IP is being embedded into Nvidia’s next product generation.

Talent scarcity. Jonathan Ross designed Google’s first Tensor Processing Unit before founding Groq. The number of engineers who can build a deterministic dataflow accelerator and bring it to commercial scale is tiny. Replacing this team organically would take years and cost more.

Competitor elimination. At $6.9 billion, Groq was approaching hyperscaler-contract scale — 2.0 million registered developers, 5.6x year-over-year growth. Nvidia holds 90–95% of the data-centre GPU market. Paying roughly 3x a challenger’s valuation to remove it before it can leverage that growth is straightforward arithmetic. The entire $20 billion is less than a single quarter’s worth of Nvidia’s data-centre revenue.
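As a sanity check on that premium, here is the arithmetic using the figures quoted above:

```python
# Back-of-envelope check on the deal figures quoted in this article ($ billions).
deal_price = 20.0          # licence fee paid by Nvidia, Dec 2025
series_e_valuation = 6.9   # Groq's Series E valuation three months earlier

premium = deal_price / series_e_valuation
print(f"Premium over Series E valuation: {premium:.1f}x")  # -> 2.9x
```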

What Makes the LPU Architecturally Different from a GPU — and Why Does It Matter for Inference?

On Llama 3.1 8B, GroqCloud delivered approximately 840 tokens per second as of early 2026. On Llama 3.3 70B, around 394 tokens per second. Roughly 2x faster than GPU-based alternatives at comparable price points.

The reason is architectural.

A conventional GPU follows the Von Neumann model: fetch, decode, execute, store, repeat. Dynamic scheduling resolves conflicts at runtime. Every cache miss and pipeline stall wastes cycles. For inference decode — generating output tokens — that overhead accumulates fast.

The LPU uses dataflow architecture: an “assembly line architecture” where all instruction ordering is resolved at compile time, not at runtime. The chip never idles waiting for data. The compiler pre-plans every data movement before a single token is generated; at runtime, data flows deterministically through the chip like items on a conveyor belt.
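The compile-time-scheduling idea can be sketched in a few lines. This is an illustrative toy, not Groq’s actual compiler: the op names and the linear decode pipeline below are invented for the example. The point is that the full execution order is fixed before any data arrives, so runtime does no dependency resolution.

```python
from graphlib import TopologicalSorter

# Toy dataflow graph for one decode step: each op lists the ops it consumes.
# Op names are hypothetical; real LPU scheduling operates at a far finer grain.
graph = {
    "embed":  [],
    "attn":   ["embed"],
    "mlp":    ["attn"],
    "logits": ["mlp"],
    "sample": ["logits"],
}

def compile_schedule(graph):
    """'Compile time': fix a complete execution order before any data arrives."""
    ts = TopologicalSorter({op: set(deps) for op, deps in graph.items()})
    return list(ts.static_order())

def run(schedule):
    """'Runtime': walk the pre-planned order; no dynamic dependency checks,
    no stalls waiting to discover what can execute next."""
    for op in schedule:
        pass  # each unit fires exactly when the schedule says it will

schedule = compile_schedule(graph)
print(schedule)  # dependency-respecting order, fixed before execution
```

A GPU, by contrast, resolves the equivalent ordering dynamically every step — which is exactly the overhead the article describes accumulating during decode.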

LPUs use on-chip SRAM rather than external High Bandwidth Memory. Worth clarifying: if Nvidia wanted SRAM instead of HBM, it didn’t need to buy Groq. The asset is the dataflow architecture. The SRAM is a consequence, not the prize.

The LPU handles the decode phase of inference while Nvidia’s Rubin GPUs handle prefill. They’re complementary — Vera Rubin NVL72 combined with LPX delivers up to 35x higher inference throughput per megawatt for trillion-parameter models. For a detailed look at how the LPU architecture affects inference benchmarks across GPU and ASIC alternatives, see our inference economics analysis.

The Antitrust Play: Why “Non-Exclusive Licensing” Was the Only Deal Structure That Made Sense

Nvidia is under active antitrust scrutiny in the US, EU, UK, and China. The blocked ARM acquisition (~$40 billion) and the scrutinised Mellanox deal ($6.9 billion, 2019) are the reference points Nvidia’s legal team was working against. A formal acquisition of Groq would have required filings in all four jurisdictions, with a realistic timeline of one to two years and genuine risk of being blocked.

The non-exclusive licensing structure follows a tested template. In March 2024, Microsoft paid $650 million to license Inflection AI’s technology and hired founder Mustafa Suleyman, while Inflection nominally remained independent. The FTC opened an inquiry but took no blocking action. Nvidia used that transaction as the structural blueprint — same mechanism, same intent, same outcome. It is one step in Nvidia’s consistent playbook of neutralising challengers before they reach acquisition-triggering scale.

The “non-exclusive” framing keeps Groq legally independent, which reduces the merger argument and theoretically preserves Groq’s ability to licence to Nvidia’s competitors — giving regulators plausible deniability that competition remains. In March 2026, Senators Warren and Blumenthal sent a formal letter characterising it as a potential “reverse acquihire” and urging the DOJ and FTC to review the arrangement.

Nvidia got what it needed commercially. The legal structure was designed to make that hard to challenge.

What Happens to GroqCloud and the Developers Building on It?

GroqCloud continues to operate under new CEO Simon Edwards. The commercial fundamentals are solid: 2.0 million registered developers, 5.6x year-over-year growth, competitive pricing (Llama 3.1 8B at $0.05/M input tokens; Llama 3.3 70B at $0.59/M input tokens).

The engineers who built the LPU hardware stack — Jonathan Ross and Sunny Madra — are now at Nvidia. The technical leadership has transferred. What that means for the hardware roadmap is the key open question: the LP30 is being brought to market as part of Nvidia’s Vera Rubin LPX platform in 2026, but whether it will also be available for GroqCloud infrastructure or integrated exclusively into Nvidia’s AI Factory is currently unanswered.

The API is live, performance is unchanged, and there’s no stated end-of-service timeline. The practical approach: keep using GroqCloud where it delivers value, but don’t build deep lock-in to any single inference provider. Both Groq and Cerebras offer OpenAI-compatible APIs — migration is a matter of changing environment variables. That optionality is worth preserving regardless of how the LP30 situation resolves.
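Here is what that optionality looks like in practice — a minimal sketch of driving provider choice from environment variables. The `INFERENCE_*` variable names are invented for this example, and you should verify the base URLs against each provider’s current documentation before relying on them:

```python
import os

# Both GroqCloud and Cerebras expose OpenAI-compatible endpoints, so a client
# can switch providers without code changes. URLs below reflect each provider's
# documented OpenAI-compatible endpoint at time of writing; verify before use.
PROVIDERS = {
    "groq":     "https://api.groq.com/openai/v1",
    "cerebras": "https://api.cerebras.ai/v1",
}

def client_config(env=os.environ):
    """Resolve endpoint and key from the environment; migration = new env vars."""
    provider = env.get("INFERENCE_PROVIDER", "groq")
    return {
        "base_url": PROVIDERS[provider],
        "api_key": env.get("INFERENCE_API_KEY", ""),
    }

# With the official openai client this becomes OpenAI(**client_config()),
# and client.chat.completions.create(...) calls are identical across providers.
print(client_config({"INFERENCE_PROVIDER": "cerebras", "INFERENCE_API_KEY": "sk-test"}))
```

That thin indirection layer is the whole migration story — which is why avoiding deeper, provider-specific lock-in is cheap insurance.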

Cerebras, SambaNova, and the Remaining Inference Challengers — What the Groq Deal Signals for Them

With the core Groq team now inside Nvidia, the inference-specialist chip space has lost its highest-profile independent voice. The two primary remaining challengers are Cerebras Systems and SambaNova Systems.

Cerebras Systems is the most technically differentiated remaining option. The WSE-3 is wafer-scale — the chip is the size of a dinner plate — with more than 40 GB of on-chip SRAM. Performance: approximately 2,200 tokens per second on Llama 3.1 8B versus Groq’s 840, at around $0.10/M input versus Groq’s $0.05/M. Roughly 2.6x faster at 2x the price. For workloads where decode latency is the primary constraint, Cerebras is the most technically comparable alternative.
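Putting those quoted numbers side by side (early-2026 Llama 3.1 8B figures from this article; illustrative only):

```python
# Quoted early-2026 figures for Llama 3.1 8B: (tokens/sec, $ per million input tokens)
providers = {
    "groq":     (840,  0.05),
    "cerebras": (2200, 0.10),
}

speed_ratio = providers["cerebras"][0] / providers["groq"][0]   # ~2.6x faster
price_ratio = providers["cerebras"][1] / providers["groq"][1]   # 2x the price

for name, (tps, price) in providers.items():
    print(f"{name}: {tps} tok/s at ${price:.2f}/M input tokens")
print(f"Cerebras: {speed_ratio:.1f}x the throughput at {price_ratio:.0f}x the price")
```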

SambaNova Systems is consistently cited alongside Cerebras as an inference specialist, but limited public benchmark data is available. Commercial products exist; detailed independent comparisons are sparse.

The deal sends two signals both companies should read clearly.

Validation: Nvidia would not pay $20 billion for inference-specialist architecture if inference ASICs were irrelevant. This confirms the thesis. Analysts project specialised inference ASICs could capture 45% of total AI accelerator revenue by 2030 — Nvidia is making sure it captures that segment, not independents.

Precedent: NextPlatform after GTC 2026 was direct: “I think there is a very good chance Cerebras will get one [a similar offer], too.” The inference-ASIC market now operates under the explicit precedent that Nvidia will act when an architecture reaches sufficient traction.

Both Cerebras and SambaNova are operational. Both carry the same acquisition-risk overhang that Groq carried before December 2025.

What the Groq Deal Tells Us About Nvidia’s Next Moves

The deal establishes a template: identify the most credible inference-specialist challenger, wait until it has enough traction to validate the IP, then acquire the IP and talent via non-exclusive licensing before it reaches acquisition-triggering scale.

Jensen Huang’s GTC 2026 keynote confirmed this is an offensive integration strategy, not a defensive one. Nvidia has been extending platform control beyond chips for years — Mellanox added InfiniBand (2020), Run:ai added GPU orchestration (2024), SchedMD added Slurm (2025). The Groq deal adds inference decode architecture. Each step extends Nvidia’s influence to the systems that decide which hardware runs what jobs.

What this means for your AI infrastructure decisions in 2026: Nvidia is not just defending GPU dominance. It’s absorbing the architectures that might have competed with it and integrating them as platform features. The remaining independents — Cerebras, SambaNova, NextSilicon — exist. But they operate under the precedent that scale and technical credibility trigger an offer, not a defence against one.

For the full account of how this fits into Nvidia’s long-term dominance, see the full monopoly playbook. For the procurement implications — specifically what the Groq deal means for your GPU infrastructure decisions — see our GPU procurement decision framework.

FAQ: The Nvidia Groq Deal

Is GroqCloud shutting down after the Nvidia deal?

No. GroqCloud keeps running under new CEO Simon Edwards. The platform has 2.0 million registered developers and solid commercial fundamentals. That said, the founding team has transferred to Nvidia, and the long-term hardware roadmap is uncertain — specifically whether the LP30 chip will serve GroqCloud infrastructure or get integrated exclusively into Nvidia’s AI Factory.

What is a Language Processing Unit (LPU) and how does it differ from a GPU?

An LPU is a deterministic AI inference accelerator. It uses dataflow architecture and static scheduling to eliminate the memory-access bottlenecks that slow GPUs down. The LPU resolves all instruction ordering at compile time and uses on-chip SRAM — delivering roughly 840 tokens per second on Llama 3.1 8B, approximately 2x faster than GPU alternatives at comparable price points.

What is “non-exclusive licensing” and why did Nvidia choose this structure?

Non-exclusive licensing means Nvidia uses Groq’s IP without preventing Groq from licensing it to others. Nvidia chose this structure to avoid triggering antitrust merger-review filings in the US, EU, UK, and China. Following the Microsoft-Inflection precedent from March 2024, it achieves commercial equivalence to an acquisition without the regulatory exposure.

Who is Jonathan Ross and why does his background matter?

Jonathan Ross founded Groq and designed Google’s first Tensor Processing Unit (TPU) — one of a very small number of engineers who have built a custom AI accelerator from scratch and brought it to commercial scale. His transfer to Nvidia is the talent-acquisition dimension of the deal. Nvidia acquired not just blueprints but the team that builds the next generation.

What is “static scheduling” in the context of the LPU?

Static scheduling means Groq’s compiler resolves all instruction ordering before runtime. A conventional GPU works it out dynamically at runtime. The LPU executes a pre-planned sequence with no wasted cycles waiting for data.

What is dataflow architecture and why did Nvidia want it?

Dataflow architecture streams data through processing units sequentially rather than following the Von Neumann cycle. It eliminates the memory bottlenecks that limit GPU inference. Nvidia wanted it because Jensen Huang stated LPU decode should handle 25% of compute in an AI cluster — integrating it into Vera Rubin extends inference capability without redesigning the GPU.

What is the LP30 chip and will it be available outside Nvidia’s platform?

The LP30 is Groq’s third-generation LPU, built on Samsung 4nm. In Nvidia’s rack-scale configuration (NVIDIA Groq 3 LPX), 256 chips deliver 315 PFLOPS, 128 GB total SRAM, and 40 PB/s bandwidth. Whether it will also serve GroqCloud is the most important open question for GroqCloud developers right now.

Is Groq still an independent company after the Nvidia deal?

Legally, yes — Groq Inc. has its own CEO, investors, and commercial product. But its founding team is at Nvidia, Nvidia holds the core IP licence, and future hardware depends on architecture Nvidia is integrating into its own platform. Legal independence: real. Commercial independence: substantially diminished.

What are the best alternatives to Groq for AI inference now?

The most comparable option for low-latency decode is Cerebras Systems (WSE-3, ~2,200 tokens/second on Llama 3.1 8B, roughly 2x Groq’s pricing). SambaNova Systems has commercial products but sparse public benchmark data. Both offer OpenAI-compatible APIs — migration is mostly a matter of changing environment variables. Both also carry the same acquisition-risk overhang that Groq carried before December 2025.

Why did Groq raise $750M at $6.9B just three months before the $20B deal?

The Series E funded LP30 development and GroqCloud growth. Nvidia moved before the LP30 could reach market and give Groq true hyperscaler leverage. The raise established the valuation baseline; Nvidia paid 2.9x to eliminate the competitive threat before it crystallised. Sequential events, not contradictory ones.

Could Groq license its LPU IP to AMD or Google?

Theoretically, yes — the non-exclusive clause preserves that right. In practice, Groq’s engineering leadership is at Nvidia, LP30 development is tied to Nvidia’s AI Factory roadmap, and Groq’s survival depends on GroqCloud revenue. The clause matters most as a regulatory instrument.

What is the Microsoft-Inflection deal and how does it relate to the Nvidia-Groq deal?

In March 2024, Microsoft licensed Inflection AI’s technology and hired founder Mustafa Suleyman without formally acquiring the company. The FTC looked but did not block it. Nvidia used this as the structural template — same mechanism, same intent, same outcome. It’s the most direct evidence the deal’s legal structure was deliberate regulatory engineering.
