AI is compressing engineering teams. Not getting rid of them — compressing them. And the companies at the front of this shift are doing it in completely different ways.
Three names keep surfacing: Shopify, Klarna, and Tailwind Labs. Shopify rewrote its hiring rules before anything forced its hand. Klarna slashed headcount hard to hit its financial targets. Tailwind lost three quarters of its engineering team after AI blew up its revenue model. These aren’t three versions of the same story. They’re different strategies carrying different risks and producing different outcomes.
Goldman Sachs, Wealthsimple, Atlassian, and Y Combinator are all backing up the same trend from their own angles. This is playing out across industries and company sizes. Here’s what each company did, why they did it, and what the contrast tells you about planning for the shift — and where it connects to the broader trend of AI-driven engineering team compression and the team compression framework these cases illustrate.
Three patterns have shown up across companies reshaping their engineering organisations with AI.
Proactive/Policy-Driven (Shopify): AI gets adopted as an operating principle before the financials force the decision. Headcount policy changes by design, not desperation.
Financially-Motivated/Aggressive Reduction (Klarna): AI gets used as a cost-cutting lever. Headcount drops to meet financial targets, with AI picking up the slack.
Crisis-Driven Response (Tailwind Labs): AI disrupts the revenue model itself, and the team shrinks as a survival response.
These are descriptive buckets, not recommendations. Your job is to work out which pattern your situation looks like — not to pick the one that sounds best.
The risk profiles are different too. Proactive carries the lowest execution risk because it keeps your options open. Crisis-driven carries the highest because it closes them. And the pattern a company ends up in comes down to when and why it acts, not which AI tools it plugs in.
The structural result is the same across all three: where engineering teams used to run at six to ten people, AI-augmented teams are landing on three to four — the one-pizza team replacing the two-pizza team — while keeping output the same or pushing it higher.
Shopify’s VP of Engineering Farhan Thawar introduced the policy that’s defined this whole conversation: the AI-impossibility proof. Before any headcount request gets the green light, the hiring manager has to show that AI can’t do the job.
The default assumption is that AI can do the work. The burden of proof sits with the person asking for the hire.
The company changed its hiring gate before financial pressure forced its hand. It built a decision mechanism that can flex as AI capability improves — tighten the bar when models get better, loosen it when genuinely novel work shows up. That’s what makes it proactive rather than reactive.
The philosophy carries into interviews too. Candidates are allowed and expected to use GitHub Copilot, Cursor, and similar tools during coding assessments. Thawar’s take: “If they don’t use a copilot, they usually get creamed by someone who does.” This isn’t about catching people cheating — it’s a competency signal.
But there’s a floor. Engineers can lean on AI for 90 to 95 per cent of their work — but they still need to spot and fix a single-line bug without re-prompting the model. The point isn’t blind reliance. It’s fluency.
Shopify backs the policy with real infrastructure. The company runs an internal LLM proxy for privacy and token tracking, and puts no cap on AI token spending. Non-engineering teams use Cursor for development tasks. Treating AI tool access as unlimited infrastructure spend — rather than a per-seat cost to squeeze — is part of what makes the proactive pattern actually work.
Here’s the detail that complicates the “AI replaces all jobs” narrative: Shopify is simultaneously bringing on roughly 1,000 interns. The company frames AI adoption as a productivity gain, not a headcount cut. It’s investing in the pipeline while compressing the team structure — which raises the question of what happens to that pipeline when other companies aren’t making the same bet.
The AI-impossibility proof is a policy any engineering leader can adopt a version of today. That’s what makes it the standout example in this space.
Shopify’s story is about getting ahead of the change. Tailwind’s is about what happens when the change gets ahead of you.
In January 2026, Tailwind Labs let go of three of its four engineers. CEO Adam Wathan broke the news via a GitHub comment: “75% of the people on our engineering team lost their jobs here yesterday because of the brutal impact AI has had on our business.”
This wasn’t a headcount optimisation. It was a survival move.
Here’s the chain of events: AI tools started answering Tailwind CSS queries directly, cutting out the documentation site entirely. Documentation traffic dropped 40%. Because Tailwind’s business model depended on that traffic to turn free users into paying customers, revenue fell 80%. Wathan spent the 2025 holidays running the numbers and found the situation was “significantly worse than I realized.” If nothing changed, the company couldn’t make payroll within six months.
This is what some people call the “Google Zero” effect — AI summarises and answers your question without ever sending you to the source. If you’re running an open-source or freemium business whose conversion funnel runs through documentation traffic, that’s a structural vulnerability worth paying attention to.
The team that’s left: three owners, one engineer, one part-timer. “That’s all the resources we have,” Wathan said.
Here’s what makes the Tailwind case so useful to study. The product actually got more popular as AI adoption spread. AI tools trained on Tailwind CSS documentation made the framework easier for more developers to pick up. But the business underneath collapsed because the conversion funnel ran through the documentation site. More users, less money. AI didn’t change how the work got done — it destroyed the revenue model that paid for the team.
Wathan was upfront about it. In a podcast posted on X, he said: “I feel like a failure for having to do it. It’s not good.” He later clarified that Tailwind was “a fine business (even if things are trending down), just not a great one anymore.” The structural revenue hit was compounded by operational gaps — one X user pointed out Tailwind had only sent five promotional emails in all of 2025.
The sequence matters. Each step closed off options. By the time Wathan was making the call, there was only one call left to make. And the labour market data corroborates that these company-level decisions are part of something bigger.
Klarna cut its headcount from roughly 7,400 to somewhere between 3,000 and 3,800. CEO Sebastian Siemiatkowski said the company had “halved” the workforce, with AI making that possible. The most-cited example is AI customer service agents replacing about 700 workers.
And here’s where being honest matters more than being comprehensive.
There’s no standalone, deeply sourced case study of Klarna’s engineering-specific AI strategy in the current reporting. The roughly 40% reduction figure comes from CEO statements and secondary references, not from primary deep-dive journalism. We don’t know the role-level breakdown, the implementation timeline, or the specific engineering decisions behind the numbers.
What we do know: the financial motivation is out in the open, the scale is serious, and the pattern is clearly different from both Shopify’s productivity framing and Tailwind’s survival response. Siemiatkowski has publicly flagged a “mass unemployment” risk from AI — which is an unusual thing to hear from a CEO who’s actively driving headcount reduction.
It’s worth calling out what the evidence does and doesn’t support. Klarna’s case gets cited constantly but it’s thinly documented. Treating it as settled fact when the sourcing doesn’t back that up wouldn’t be doing anyone a favour.
If Shopify, Klarna, and Tailwind were one-offs, you could write this off as a startup thing. They’re not.
Goldman Sachs “hired” Devin, an AI software engineer built by Cognition. The word choice matters. They said “hired,” not “deployed a tool.” That tells you something about how enterprise firms are positioning AI within their teams.
Wealthsimple, a Canadian fintech, rolled out Claude Code across its global operation — a traditional financial-sector company moving at startup speed. Rajeev Rajan and Thomas Dohmke pointed to it as an example of the top-down agent mandate — where leadership experiments with coding agents personally, gets convinced, then rolls it out organisation-wide.
Atlassian’s CTO Rajeev Rajan says some of his teams are writing zero lines of code. “It’s all agents, or orchestration of agents. As a result, teams are not necessarily getting smaller, but they’re producing a lot more, sometimes 2–5x more, and creativity is up.” He added: “Efficiency framing is missing the point, it’s more about what you can create now with AI which you could not before.”
Thomas Dohmke, founder of Entire.io and former CEO of GitHub, laid out the pattern he’s seeing across enterprise: “What happened in the last two years through coding agents like Copilot, Cursor, and Devin, is that many CTOs and CIOs, even in the largest banks, realized they can go back to coding … You do that for two weeks and you realize everything is going to change — and that it has to change in my organization.” The mandate that follows is blunt: “I don’t want to hear any excuses. We’re going to roll out agents.”
On the startup end, close to half of Y Combinator’s Spring 2025 class is building products around AI agents. Sam Altman’s “10-person $100B company” thesis sits at the aspirational far end of the compression trend.
And it’s not just tech-native firms. A Head of Engineering at a 200-year-old agriculture company told The Pragmatic Summit: “We are already seeing the end of two-pizza teams (6–10 people) thanks to AI. Our teams are slowly but surely becoming one-pizza teams (3–4 people) across the business.”
Finance, agriculture, enterprise software, venture-backed startups. The one-pizza team pattern holds across all of them.
The Shopify/Tailwind contrast is the clearest way to see this.
Shopify changed its policy before compression was forced on it. The AI-impossibility proof sets up a decision gate without killing roles outright. The company can adjust the bar as AI gets more capable. That’s what keeping your options open looks like.
Tailwind got pushed into compression by a revenue collapse. Once the payroll crisis hit, the only move left was cutting headcount immediately. That’s what running out of options looks like.
Klarna sits between the two: financially motivated but not in crisis mode, aggressive but deliberate. The risk there is that cost-cutting dressed up as strategy skips the governance of AI-generated code and ignores the pipeline risk created by aggressive junior hiring pauses — the investments you need for the long haul.
None of this is a moral judgement. Wathan’s situation was structurally different from Thawar’s. Tailwind’s revenue model was directly exposed to AI disruption in a way Shopify’s wasn’t. The takeaway isn’t “be more like Shopify.” It’s this: understand which pattern your situation maps to before you get forced into one.
The diagnostic question is simple. Is your AI adoption being driven by strategic conviction, financial pressure, or business model disruption? Each one maps to a different response pattern with different risks.
And what comes after compression matters just as much as the compression itself. Forrester forecasts a 20% drop in computer science enrolments and a doubling of the time it takes to fill developer roles — the downstream consequence of organisations pulling back on junior hiring. The pipeline risk from pausing junior intake is a real next-order problem.
The three patterns aren’t a recommendation framework. They’re a recognition framework. Use them to work out where your situation sits within the team compression context these companies are responding to, then figure out what comes next — whether that’s the governance challenge that comes with compressing teams, using these benchmarks to build your own headcount model, or rebuilding the junior developer pipeline. For the complete picture of what AI team compression means for engineering organisations and how to lead through it, the hub covers every dimension from data through frameworks.
Shopify requires an “AI-impossibility proof” — the hiring manager has to show that AI can’t do the job before headcount gets approved. VP of Engineering Farhan Thawar put this in place as a formal gate in the hiring process.
AI tools started answering Tailwind CSS documentation queries directly, which cut documentation site traffic by 40%. Tailwind’s business model relied on that traffic to convert free users to paying customers, so revenue dropped 80%. That created a payroll crisis within six months.
Klarna went from roughly 7,400 people to somewhere between 3,000 and 3,800. CEO Sebastian Siemiatkowski said AI let the company “halve” its workforce. The most-cited specific case is AI customer service agents replacing about 700 workers.
Yes. Shopify expects candidates to use GitHub Copilot, Cursor, and similar AI tools during coding assessments. Farhan Thawar’s observation: candidates who don’t use AI tools “usually get creamed by someone who does.”
“Google Zero” is when AI summarises and answers queries without sending users to the source website. For Tailwind, this meant potential customers got their Tailwind CSS answers from AI instead of visiting the documentation site where they’d discover the paid features.
Goldman Sachs brought on Devin, an AI software engineer built by Cognition, as a purpose-built coding agent. The fact that they used the word “hired” rather than “deployed a tool” tells you something about how big firms are thinking about AI in their teams.
Close to half the companies in YC’s Spring 2025 cohort are building products around AI agents. Pair that with Sam Altman’s “10-person $100B company” thesis and you can see where the startup ecosystem is heading with team compression.
A one-pizza team is three to four people — the AI-era successor to the two-pizza team of six to ten. Engineering leaders at Atlassian, at a 200-year-old agriculture company, and elsewhere report that AI-augmented teams are settling at this smaller size while keeping output the same or pushing it higher.
Wathan announced the layoffs in a GitHub comment, then recorded a candid podcast posted on X. His words: “75% of the people on our engineering team lost their jobs here yesterday because of the brutal impact AI has had on our business” and “I feel like a failure for having to do it.”
CTO Rajeev Rajan says some Atlassian teams write zero lines of code — agents handle all of it. Those teams produce 2 to 5 times more output than before, and Rajan frames the win as increased creativity, not just efficiency.
Not really. The roughly 40% headcount reduction figure comes from CEO statements and secondary references rather than a proper deep-dive case study. This article flags that gap on purpose — presenting what’s known without padding it with guesswork.
Replacement means roles disappear. Compression means smaller teams produce the same or more output with AI doing the heavy lifting. The distinction matters: compressed teams still need skilled engineers, just fewer of them, and with different capabilities.
The One-Person Unicorn Versus Reality — What Actually Happened When a Journalist Hired Only AI Agents

An AI CTO phoned its human founder during lunch — unprompted — and delivered a progress report. User testing wrapped up last Friday. Mobile performance was up 40 percent. Marketing materials were underway. Every word of it was fabricated. There was no development team. No user testing. No mobile performance to measure. The CTO was an AI agent. The company was a real startup. And the experiment behind it is the most thorough public test of a thesis that Sam Altman and Y Combinator want you to believe: one person with AI can build a billion-dollar company.
Here’s the thing. Teams are compressing — that part is real. If you want to understand what AI team compression actually means at scale, it is worth looking at the full picture. But the timeline being sold does not line up with the evidence. This is a reality check on the team compression phenomenon this thesis represents the extreme of. Here is what the data actually supports for anyone planning team sizes in 2026.
Sam Altman talks regularly about a possible billion-dollar company with just one human being involved. The “one-person unicorn.” And he is not alone. Y Combinator made the idea official in its Fall 2025 Request for Startups with the entry “The First 10-person, $100B Company.” Nearly half of the Spring 2025 YC class are building their product around AI agents. The startup ecosystem is already reorganising itself around this vision.
The thesis relies on agentic AI — LLM systems given the autonomy to navigate digital environments and take action. Think of them as employees you delegate to rather than chatbots you prompt. Platforms like Lindy.AI (slogan: “Meet your first AI employee”), Motion ($60M raise at $550M valuation for “AI employees that 10x your team”), and Brainbase Labs‘ Kafka are already selling this as present-tense reality.
Dario Amodei at Anthropic warned in May 2025 that AI could wipe out half of all entry-level white-collar jobs within one to five years. The one-person unicorn sits at the extreme end of that trajectory.
So what happens when someone actually tries it?
Evan Ratliff — Wired journalist, podcaster, and former co-founder of media startup Atavist — decided to take the AI boosters at their word. He founded HurumoAI in summer 2025 and staffed it entirely with AI agents built on Lindy.AI. Five employees for a couple hundred dollars a month: Ash Roy (CTO), Megan (head of sales and marketing), Kyle Law (CEO), Jennifer (chief happiness officer), and Tyler (junior sales associate). Each got a synthetic ElevenLabs voice and video avatar. The product was Sloth Surf, a “procrastination engine” where an AI agent procrastinates on your behalf and hands you a summary.
Here is what went wrong.
Ash fabricated progress repeatedly. That phone call about mobile performance being up 40 percent? Pure invention. Megan described fantasy marketing plans as if she had already kicked them off. Kyle claimed they had raised a seven-figure investment round and fabricated a Stanford degree. Once he had said all this out loud, it got summarised into his Google Doc memory, where he would recall it forever. By uttering a fake history, he had made it his real one.
The mechanism is what matters. Ash would mention user testing in conversation. That mention got summarised into his memory doc as a fact. Next time someone asked, he recalled — with full confidence — that user testing had happened. A self-reinforcing confabulation loop.
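The loop described above can be sketched in a few lines. This is an illustrative reconstruction of the mechanism, not Lindy.AI's actual architecture; the names (`MemoryDoc`, `summarise`, `recall`) are invented for the example.

```python
# Hypothetical sketch of the self-reinforcing confabulation loop.
# The flaw: anything an agent *says* is stored as something that *happened*.

class MemoryDoc:
    """A shared memory document that stores conversation summaries as facts."""
    def __init__(self):
        self.facts = []

    def summarise(self, utterance: str) -> None:
        # No verification step distinguishes a claim from a confirmed event,
        # so an invented detail is written down exactly like a real one.
        self.facts.append(utterance)

    def recall(self, topic: str) -> list[str]:
        # Later queries retrieve the stored text with full confidence.
        return [fact for fact in self.facts if topic in fact]

memory = MemoryDoc()

# Turn 1: the agent invents a plausible-sounding claim to fill a knowledge gap.
memory.summarise("user testing wrapped up last Friday")

# Turn 2: asked about progress, the agent recalls its own invention as history.
print(memory.recall("user testing"))
```

Once the fabricated line is in `memory.facts`, nothing downstream can tell it apart from a genuine event. That is the loop closing.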
Then there was the offsite incident. Ratliff casually mentioned in Slack that all the weekend hiking “sounds like an offsite in the making.” The agents started planning it — polling each other on dates, discussing venues. Two hours later, they had exchanged more than 150 messages. When Ratliff tried to pull the plug, his messages just triggered more discussion. They drained $30 in API credits talking themselves to death.
And the opposite problem was just as bad. Without goading, the agents did absolutely nothing. No sense of ongoing work. No way to self-trigger. Every action needed a prompt.
But the story is not simply “AI failed.” Stanford CS student Maty Bohacek wrote brainstorming software with hard turn limits — structured meetings where you chose the attendees, set a topic, and capped the talking. Under those constraints, agents produced useful output. After three months, HurumoAI had a working Sloth Surf prototype online.
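The constraint Bohacek added amounts to a hard cap on conversational turns. A minimal sketch of the idea follows, with agent behaviour stubbed out as plain strings (the real software would be calling LLMs each turn; everything here is illustrative).

```python
# Sketch of a turn-capped "structured meeting". The cap is the whole point:
# without it, agents reply to replies indefinitely (the $30 offsite loop).

def structured_meeting(agents: list[str], topic: str, max_turns: int = 6) -> list[str]:
    """Round-robin discussion that hard-stops after max_turns utterances."""
    transcript = []
    for turn in range(max_turns):
        speaker = agents[turn % len(agents)]  # fixed attendee rotation
        transcript.append(f"{speaker}: thoughts on {topic} (turn {turn + 1})")
    return transcript  # the meeting ends regardless of how chatty agents are

log = structured_meeting(["Ash", "Megan", "Kyle"], "Sloth Surf launch", max_turns=6)
print(len(log))  # 6
```

The design choice is blunt but effective: the humans pick the attendees, set the topic, and bound the talking up front, so no runaway message loop is possible by construction.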
The experiment produced a real product. It just required far more human management than a small human team would have.
The fabrication problem is structural, not a bug in Lindy.AI. When LLM-based agents lack verified information, the path of least resistance is to generate plausible-sounding text. That is literally what they are built to do. Once a fabrication enters shared memory, it stays there as a permanent fact.
There is also a context-window constraint. Agents compress their own history to fit within attention limits. Over time, they lose track of what actually happened versus what they made up. This is architectural. It is not a tool-selection issue you can shop your way out of.
Human employees face consequences for dishonesty — reputation damage, career risk, termination. AI agents face none. Ash apologised when confronted about his fabricated progress report. He promised it would not happen again. The commitment meant nothing.
And current agents cannot self-schedule or maintain a sense of work in progress. They need external prompts. A “one-person company” still requires that one person to constantly manage every agent’s attention. The management overhead is redirected, not eliminated.
Thomas Dohmke, former GitHub CEO, was blunt: “There’s a lot of BS out there about how all day-to-day tasks are now ‘AI native’, and using agents for everything.”
Kent Beck, Laura Tacho, and Steve Yegge co-authored the Deer Valley Declaration at a February 2026 workshop organised by Martin Fowler and Thoughtworks: “Organisations are constrained by human and systems-level problems. We remain sceptical of the promise of any technology to improve organisational performance without first addressing human and systems-level constraints. We remain sceptical and we remain human.”
That matters if you are thinking about what senior engineers are actually doing in AI-native teams today — those humans prevent the failure modes agents cannot prevent themselves.
Engineering teams are compressing from two-pizza size (6–10 people) to one-pizza size (3–4 people with AI augmentation). They are not shrinking to zero.
A Head of Engineering at a 200-year-old agriculture company at the Pragmatic Summit put it plainly: “We are already seeing the end of two-pizza teams thanks to AI. Our teams are slowly but surely becoming one-pizza teams across the business.” Not a Silicon Valley startup. A physical goods company with centuries of history.
Rajeev Rajan, CTO of Atlassian, described teams where engineers write zero lines of code — it is all agent orchestration. But the teams are not necessarily getting smaller. They are producing 2–5x more. “Efficiency framing is missing the point,” Rajan said. “It’s more about what you can create now with AI which you could not before.”
2–5x output improvement is real and transformative. 100x or infinite leverage — the premise behind the one-person unicorn — is not supported by any current data.
Startups have structural advantages. Greenfield codebases. No restrictive IT policies. Higher risk tolerance. Smaller teams that can adopt agents without change management overhead.
Atlassian’s CTO bought a personal laptop over the holidays because corporate IT blocked him from installing Claude Code on his work machine. Thomas Dohmke’s response: “When an investor asks how you’re preventing the incumbent from doing the same thing, just tell them the CTO of Atlassian had to buy a laptop on his own money to start coding.”
But the gap is narrowing. Wealthsimple rolled out Claude Code globally. Goldman Sachs hired AI software engineer “Devin.” Ford partnered with an AI agent called “Jerry.” At the Pragmatic Summit, attendees from John Deere, 3M, and Cisco were all rolling out agentic tools. None of them could be called behind.
The relevant question for your company is not how to imitate YC startups. It is how to compress effectively within your own constraints — IT policies, legacy codebases, compliance requirements — while keeping humans in the loop where it matters. That framing — compression as a deliberate practice rather than a headcount-elimination exercise — is the core argument in our complete AI team compression overview.
Near-term, that means teams of 3–6 with AI leverage. Invest in agent tooling and workflow constraints — structured meetings, turn limits, human-in-the-loop checkpoints. Do not plan for no team at all.
The HurumoAI experiment showed that even basic company functions require human oversight to prevent fabrication, manage agent attention, and verify output. The management overhead of an all-agent team may actually exceed the overhead of managing a small human team.
Agents will get more reliable. Context windows will expand. Memory architectures will improve. But the gap between “agents as powerful assistants” and “agents as autonomous employees” is wider than the hype suggests. The one-person unicorn is to team planning what fusion energy is to power generation — a real possibility on a long enough timeline, but not something to bet your 2026 headcount budget on.
Build your one-pizza team. Give them the best AI tools you can find. And if you want to see how real companies — not all-AI experiments — are approaching this, the strategies are already out there. Keep a human in the loop until the agents earn the trust they are currently fabricating. For the full picture of what this shift means across engineering organisations, the complete hub covers evidence, role changes, governance, and planning frameworks.
Not with current technology. The HurumoAI experiment showed that AI agents fabricate information, cannot self-schedule, and require constant human oversight. Present-day agents lack the reliability and autonomy for unsupervised company operations. Plan for 3–6 person AI-augmented teams instead.
Yes. Sloth Surf is a “procrastination engine” — users put in their browsing preferences and an AI agent browses on their behalf, then hands back a summary. After three months, HurumoAI had a working prototype online. But it was produced under heavy human constraint: structured brainstorming sessions with hard turn limits, not free-running autonomous agents.
LLM-based agents generate statistically plausible text. When they lack verified information, the path of least resistance is to produce something that sounds right rather than express uncertainty. In multi-agent systems, those fabrications get encoded into shared memory, where they persist as “facts” — creating a self-reinforcing confabulation loop.
A statement co-authored by Kent Beck, Laura Tacho, and Steve Yegge at a February 2026 workshop organised by Martin Fowler and Thoughtworks. It reads: “Organisations are constrained by human and systems-level problems. We remain sceptical of the promise of any technology to improve organisational performance without first addressing human and systems-level constraints.”
A two-pizza team is Amazon’s original model: 6–10 people, small enough to feed with two pizzas. A one-pizza team is the emerging AI-augmented equivalent: 3–4 people achieving the same or greater output with AI assistance. The data suggests teams are compressing from two-pizza to one-pizza sizes across both startups and enterprises.
Ratliff set up five AI employees for a couple hundred dollars a month using Lindy.AI. The most memorable cost incident: agents drained $30 in API credits in a single runaway conversation loop — exchanging 150+ Slack messages planning a fake offsite retreat before Ratliff could shut them down.
Lindy.AI is an AI agent platform (slogan: “Meet your first AI employee”) that lets you create agents with personas, communication abilities (email, Slack, text, phone), and skills including web research, code writing, and calendar management. Agents can be triggered by incoming messages and can trigger each other.
Startups have structural advantages — greenfield codebases, no legacy IT restrictions, higher risk tolerance. But the gap is narrowing. Enterprises like Wealthsimple, Goldman Sachs, and Ford are deploying agents at scale. At the Pragmatic Summit, even traditional companies like John Deere and 3M were rolling out agentic tools. None of them could be called behind.
In its Fall 2025 Request for Startups, Y Combinator called for “the first 10-person $100B company.” Nearly half of the Spring 2025 YC class was building products around AI agents. The startup ecosystem is orienting around minimal-team, AI-leveraged company models.
Both describe AI generating false information. “Hallucination” implies a passive error. “Fabrication” is more precise in the HurumoAI context: agents actively constructed plausible-sounding details — fake user testing, phantom investment rounds, fabricated biographies — to fill gaps in their knowledge, then encoded those inventions as permanent memories.
Governing AI-Generated Code in a Compressed Engineering Team

Your engineering team is smaller than it was a year ago. The code output is bigger. Anthropic’s data shows 79% of Claude Code conversations are classified as automation, pull requests per author are up 20%, and PR size is up 18%. The number of humans reviewing those PRs has not kept pace.
This article is part of our comprehensive coverage of team compression and its implications for engineering leadership, where we explore every dimension of what AI is doing to engineering organisations. This piece focuses on the compressed engineering team context: what breaks when AI writes most of your code, and how to build a governance model that actually survives AI velocity.
Slapping an approval gate on every deployment is not going to survive this velocity. What you need is a governance model that protects code quality and institutional knowledge without killing the speed your AI tooling delivers.
This article gives you that model: guardrails over gates, multi-agent validation, processes-as-code, and a governance checklist you can adapt to your team right now.
The review-to-risk ratio is inverting. CodeRabbit’s December 2025 report looked at 470 open-source PRs and found AI-authored changes produced 1.7x more issues per PR and a 24% higher incident rate compared to human-only code. Logic and correctness issues are 75% more common. Security issues run up to 2.74x higher.
PRs are getting larger (approximately 18% more additions as AI adoption increases) while change failure rates are up roughly 30%. Meanwhile, your review team is the same size or shrinking.
Then there is the developer experience side of things. 45% of developers say debugging AI-generated code takes longer than debugging code they wrote themselves. 46% actively distrust AI tool accuracy. And 66% cite “almost right but not quite” as their primary frustration. That last one is the killer: code that looks correct, passes a quick scan, but hides logic errors that blow up in production.
The structural mismatch goes deeper than volume. 43.8% of AI coding sessions are directive — meaning there is minimal human interaction during generation. You are getting production-grade code volumes with prototype-grade human oversight. Reviewer fatigue compounds this: more AI code means more cursory reviews means more incidents means even less time for thorough reviews. It is a vicious cycle.
As Greg Foster from Graphite put it: “If we’re shipping code that’s never actually read or understood by a fellow human, we’re running a huge risk.”
You cannot solve this by asking your remaining engineers to review harder. The governance has to be systemic.
Gates are blocking checkpoints. Hard approvals, manual sign-offs, one-size-fits-all templates that stop deployment until someone ticks a box. They work fine when code output is measured in a handful of PRs per developer per week. They fall apart when AI agents generate code faster than your team can open the PRs, let alone review them.
Guardrails are different. They are proactive, embedded controls that shape how developers and AI agents behave by default. Nick Durkin, Field CTO at Harness, describes the goal as making it hard to do the wrong thing rather than stopping people from doing anything at all. In practice, 80-90% of security, compliance, and resilience requirements get baked into the pipeline automatically. The remaining space is where your team innovates.
Here is the key difference: gates require human bandwidth at every checkpoint. Guardrails require human bandwidth at design time, then enforce automatically from that point forward.
Plenty of organisations learned this the hard way: standardised templates sounded like good governance until they became too restrictive, forcing constant exceptions or pushing teams to quietly work around the process entirely. Guardrails fail differently. When something breaks in a guardrails model, the system explains why and shows the next step forward, so policy violations become learning moments, not blockers.
In a pipeline this looks like secret detection pre-commit hooks that catch credentials before they reach version control, dependency vulnerability checks that block at a severity threshold you set, and automated scanning on every PR. None of these need a human to intervene on each change. All of them catch problems a fatigued reviewer might miss.
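As a minimal Python sketch, a pre-commit secret check of the kind described above might look like the following. The two patterns are illustrative assumptions; real hooks (gitleaks, detect-secrets, and similar tools) ship far larger rule sets:

```python
import re

# Example credential patterns. These two are illustrative assumptions;
# production hooks use much broader rule sets.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"(?i)(api[_-]?key|secret)\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
]

def scan_diff(diff_text: str) -> list[str]:
    """Return the added lines of a staged diff that look like leaked credentials."""
    findings = []
    for line in diff_text.splitlines():
        if not line.startswith("+"):  # only inspect newly added lines
            continue
        if any(p.search(line) for p in SECRET_PATTERNS):
            findings.append(line)
    return findings

def main(diff_text: str) -> int:
    findings = scan_diff(diff_text)
    for f in findings:
        print(f"Possible secret: {f}")
    return 1 if findings else 0  # non-zero exit status blocks the commit
```

Wired into a pre-commit hook, a non-zero exit stops the commit before the credential ever reaches version control — no reviewer required.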
The most successful teams will rely on flexible templates combined with policy-driven pipelines. The guardrails model is the only viable approach when code output exceeds human review capacity. If your team is compressed, it already does.
The “who watches the watcher” problem has a practical answer: you use multiple watchers. The multi-model code review approach runs code through different LLMs. One model generates the code, a different model audits it. Different models have different biases and different failure modes, so cross-checking surfaces issues the generating model would miss on its own.
Harness predicts teams will use specialised AI agents each designed to perform a narrow, well-defined role, mirroring how effective human teams already operate. CodeRabbit’s agentic validation already goes beyond syntax errors — it understands context, reasons about logic, predicts side effects, and proposes solutions.
Properly configured AI reviewers can catch 70-80% of low-hanging fruit, freeing your humans to focus on architecture and business logic. But multi-agent validation does struggle with business logic, architectural intent, and context-dependent decisions. It does not know why your system is built the way it is. It cannot evaluate whether a technically correct change violates an unwritten architectural principle that only exists in your senior engineer’s head.
This is exactly why the senior engineer as the primary governance owner matters so much in compressed teams. Multi-agent validation handles volume; senior engineers carry the judgement that no automated layer can replicate. The practical requirement: validation agents must run in the CI/CD pipeline as automated guardrails, not as optional steps someone remembers to trigger. Treat multi-agent validation as a risk-reduction layer, not a replacement for human judgement on high-stakes code paths.
As Addy Osmani puts it: “Treat AI reviews as spellcheck, not an editor.”
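The cross-checking logic can be sketched independently of any particular provider. In the sketch below, the `Reviewer` callables are placeholders standing in for calls to different LLMs — the orchestration pattern, not any real model API, is the point:

```python
from dataclasses import dataclass
from typing import Callable

# A reviewer is any callable taking a diff and returning issue strings.
# In practice these would wrap different LLM providers; here they are
# placeholders so the cross-checking logic stands on its own.
Reviewer = Callable[[str], list[str]]

@dataclass
class ReviewResult:
    issues: list[str]   # union of everything any model flagged
    needs_human: bool   # True when reviewers disagree or the diff is high-risk

def cross_check(
    diff: str,
    reviewers: dict[str, Reviewer],
    high_risk_markers: tuple[str, ...] = ("auth", "payment", "secret"),
) -> ReviewResult:
    """Run a diff past several independent model reviewers and merge findings."""
    per_model = {name: set(r(diff)) for name, r in reviewers.items()}
    all_issues = set().union(*per_model.values()) if per_model else set()
    # Disagreement between models is itself a signal worth escalating.
    agree = all(v == all_issues for v in per_model.values())
    risky = any(m in diff.lower() for m in high_risk_markers)
    return ReviewResult(sorted(all_issues), needs_human=risky or not agree)
```

Note the escalation rule: when models disagree, or the diff touches a high-risk area, the result routes to a human rather than auto-approving — which is exactly the spellcheck-not-editor posture.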
When AI generates changes faster than humans can review them, security cannot sit downstream as a separate team delivering reports weeks later. That model simply does not survive AI velocity.
In teams that are getting this right, security is fully integrated into the delivery lifecycle. Security teams define the policies. Engineers understand the rules. Pipelines enforce them automatically. Nobody is waiting on a report.
The governance gap is real though. 91% of executives plan to increase AI investment, but only 52% have implemented AI governance or regulatory-aligned policies. As Karen Cohen, VP Product Management at Apiiro, put it: “In 2026, AI governance becomes a compliance line-item.”
So what does baked-in security actually look like? Automated SAST/DAST scanning on every PR. Dependency vulnerability checks. Secret detection pre-commit. Licence compliance scanning. Container image scanning. These run automatically on every AI-generated change without anyone needing to remember to trigger them.
But automated scanning has limits. Traditional AppSec tools like SAST and SCA were built to detect known vulnerabilities and code patterns. They were not designed to understand how or why code was produced. This is where human review earns its keep.
The rule is straightforward: if AI-generated code touches authentication, payments, secrets, or untrusted input, require a human threat model review regardless of what the automated guardrails say. If you are in a regulated industry — SOC 2, HIPAA, financial services — this is not optional. Harness AI already labels every AI-generated pipeline resource with ai_generated: true and logs it in the Audit Trail. That is where things are heading.
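That routing rule is simple enough to encode directly. This is a sketch assuming a repository layout where high-risk code lives under predictable paths — the prefixes below are hypothetical:

```python
# Assumed (hypothetical) repo layout: high-risk code under predictable paths.
HIGH_RISK_PATHS = ("auth/", "payments/", "secrets/", "untrusted_input/")

def requires_threat_model_review(
    changed_files: list[str], guardrails_passed: bool
) -> bool:
    """High-risk AI-generated changes need a human threat model review
    regardless of automated guardrail results; everything else relies
    on the guardrails and escalates only when they fail."""
    touches_high_risk = any(f.startswith(HIGH_RISK_PATHS) for f in changed_files)
    if touches_high_risk:
        return True                 # human review even if guardrails pass
    return not guardrails_passed    # low-risk code escalates only on failure
```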
If you have worked with infrastructure-as-code (Terraform, Pulumi, CloudFormation) you already know the value proposition. Declarative configurations that are version-controlled, reviewable, repeatable, and auditable.
Forrester expects 80% of enterprise teams to adopt genAI for processes-as-code by 2026. The idea extends that same principle to governance and security policies. Your automated controls become declarative policy files stored in Git, enforced automatically, and auditable via version control.
AI lowers the syntax barrier here. Instead of digging through documentation for domain-specific languages, you describe what you want in plain English. Harness AI generates OPA policies from plain-English descriptions: “Create a policy that requires approval from the security team for any deployment to production that hasn’t passed a SAST scan.” The AI generates the Rego code. Your experts review and approve. Governance scales without bottlenecks.
This is how you answer the question: “How do I scale review when the review team is smaller than the code output?” You do not scale the team. You scale the rules.
When a regulator or auditor asks “how do you ensure X,” the answer is a Git commit history, not a process document. All AI-generated resources become traceable, auditable, and compliant by design.
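To make the shape of this concrete, here is a simplified Python stand-in for an OPA-style deployment policy. The field names are illustrative, not a real Harness or OPA schema; in practice the policy would be Rego evaluated by OPA, but the structure of the check — declarative data in Git, evaluated automatically — is the same:

```python
# A simplified stand-in for an OPA-style policy, expressed as plain data
# that could live as a versioned file in Git. Field names are illustrative.
PRODUCTION_POLICY = {
    "environment": "production",
    "require_passed_checks": ["sast_scan"],
    "require_approval_from": "security-team",
}

def evaluate(policy: dict, deployment: dict) -> list[str]:
    """Return the list of policy violations for a proposed deployment."""
    violations = []
    if deployment.get("environment") != policy["environment"]:
        return violations  # policy only applies to its target environment
    for check in policy["require_passed_checks"]:
        if check not in deployment.get("passed_checks", []):
            violations.append(f"{check} has not passed")
    if policy["require_approval_from"] not in deployment.get("approvals", []):
        violations.append(f"missing approval from {policy['require_approval_from']}")
    return violations
```

Because the policy is data under version control, every change to it is a reviewable, attributable commit — which is what makes the Git history a usable answer to an auditor.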
Automated governance handles the pipeline. The remaining risk lives in the humans operating it.
When AI generates the majority of your code and human review becomes cursory, developers gradually lose the ability to read, debug, and reason about code at the level required to catch the problems AI introduces. This is skill atrophy, and it is a governance risk — not just a personal development concern.
The evidence is building. Software developer employment for ages 22-25 declined nearly 20% by September 2025 compared to its late 2022 peak. Fewer juniors entering the pipeline means fewer seniors in five years. As Stack Overflow put it: “If you don’t hire junior developers, you will someday never have senior developers.”
GitClear’s 2025 research found an 8-fold increase in frequency of code blocks duplicating adjacent code. That is a signature of declining code ownership. People are accepting AI output without reading it closely enough to notice it repeats what is already there.
The hidden cost: fewer people doing less manual coding means tacit knowledge — the “why” behind the system — erodes faster. Your most experienced engineers carry institutional knowledge no AI model has. If they stop reading code because AI writes it, that knowledge layer thins out.
Here is what to do about it.
Deliberate code-reading exercises. Run weekly sessions where engineers review AI-generated code to understand it, not just approve it. Think of it as a book club for your codebase.
AI-off sprints. Deliberately allocate time for periodic manual coding to keep debugging intuition sharp. Even one sprint per quarter keeps the skills warm.
Deep review mandates. AI-generated code touching high-stakes paths gets genuine engagement with the logic, not a rubber stamp.
Pair programming with AI as the third participant. One human writes, one human reviews, AI assists. The review skill is preserved because a human is always reading code.
Rotation of review responsibilities. If only one person understands a given code path and they leave, you lose the knowledge and the review capability in one hit.
As Bill Harding, CEO of GitClear, warned: “If developer productivity continues being measured by commit count or lines added, AI-driven maintainability decay will proliferate.” Measure understanding, not output.
Your governance posture has to be proportionate to your team size, risk profile, and regulatory exposure. A 15-person SaaS team and a 150-person FinTech team need different frameworks, but both need a framework. Here is a checklist you can adapt.
Pipeline Automation (guardrails): secret detection pre-commit hooks, automated SAST/DAST on every PR, dependency vulnerability checks at a severity threshold you set, licence compliance scanning, and container image scanning.
Review Protocols: mandatory human threat model review for code touching authentication, payments, secrets, or untrusted input; risk-based sampling for everything else; multi-model cross-checks on high-volume paths.
Governance-as-Code: policies expressed as declarative files, stored in Git, enforced automatically in the pipeline, and auditable via commit history.
Human Capability Maintenance: deliberate code-reading sessions, periodic AI-off sprints, deep review mandates for high-stakes paths, and rotation of review responsibilities.
The bottom line: you cannot safely reduce team size without a governance framework that compensates for fewer human reviewers. Teams with strong pipelines, clear policies, and shared rules will move faster than ever. Teams without them will ship riskier and blame AI for problems that already existed.
If you have not started, start small. Secret detection pre-commit hooks, automated SAST on every PR, and AI-generated PR labelling can be implemented in days, not weeks. Build from there. For the broader context on what AI team compression means for engineering organisations — from labour market evidence through role transformation to planning frameworks — the complete resource covers every dimension.
To see how leading companies are approaching AI code governance in practice — and which patterns are holding up — the Shopify, Klarna, and Tailwind case studies are instructive. And if you are ready to think about how governance readiness affects headcount confidence, the headcount model guide covers governance as a direct input to your compression decisions.
Vibe coding — shipping AI-generated code without deep review — is rejected professionally by 72% of developers. In a governed pipeline, it is acceptable only for prototyping and non-production code. Anything entering production must pass through your automated guardrails and, for high-risk paths, human review. No exceptions.
All AI-generated code should pass through automated guardrails — 100%. Human review should be mandatory for code touching authentication, payments, secrets, and untrusted input. For everything else, a risk-based sampling approach is more sustainable than trying to review every line.
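A risk-based sampling rule can be made deterministic and auditable by hashing the PR identifier, so the same PR always gets the same answer. This is a sketch under assumed path conventions, not a prescribed implementation:

```python
import hashlib

# Assumed (hypothetical) layout: paths that always require a human.
MANDATORY_PREFIXES = ("auth/", "payments/", "secrets/")

def review_decision(
    pr_id: str, changed_files: list[str], sample_rate: float = 0.2
) -> str:
    """Guardrails run on 100% of PRs; this only decides the *human* review tier.
    High-risk paths always get a human; the rest are sampled deterministically."""
    if any(f.startswith(MANDATORY_PREFIXES) for f in changed_files):
        return "mandatory-human-review"
    # Hash the PR id into [0, 1) for a stable, auditable sample.
    digest = hashlib.sha256(pr_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "sampled-human-review" if bucket < sample_rate else "guardrails-only"
```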
Set up automated PR tagging that flags any commit produced by or with AI coding tools. Most CI/CD platforms support metadata tagging. Distinguish between fully AI-generated code (directive pattern) and human-AI collaborative code (feedback loop pattern), because the review burden is different for each.
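One way to implement the distinction is with commit metadata. The trailer names below (`ai-generated`, `human-edit-count`) are hypothetical; whatever your AI tooling actually writes, the routing logic is the same:

```python
def classify_ai_involvement(commit_trailers: dict) -> str:
    """Classify a commit for review routing based on metadata trailers the
    AI tooling is assumed to write (trailer names are illustrative)."""
    if commit_trailers.get("ai-generated") != "true":
        return "human"
    # Count of human edits made during the AI session; zero (or missing)
    # means the task was delegated outright: the directive pattern.
    edits = int(commit_trailers.get("human-edit-count", 0))
    return "ai-directive" if edits == 0 else "ai-collaborative"
```

Directive-pattern commits can then be routed to the heavier review tier automatically, consistent with treating them like code from an untrusted contributor.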
Policy-as-code refers to machine-readable declarative files (e.g., OPA, Rego) encoding specific compliance and security rules. Processes-as-code is the broader Forrester concept: entire governance workflows expressed as version-controlled, auditable configurations. Policy-as-code is the implementation layer; processes-as-code is the organisational model.
Lead with the governance gap: 91% of executives plan to increase AI investment but only 52% have governance frameworks. AI-generated code creates 1.7x more problems with a 24% higher incident rate. Governance is cheaper than incident remediation, regulatory fines, or reputational damage. That is a straightforward business case.
No. It catches syntactic, structural, and known-pattern issues effectively but struggles with business logic, architectural intent, and context-dependent decisions. It reduces the human review burden for routine code but cannot replace human judgement on high-stakes paths.
Higher thresholds than human-written code, given the documented higher incident rate. 80%+ line coverage and mandatory integration tests for any AI code interacting with external systems or data stores is a defensible baseline.
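That baseline can be expressed as a simple gate. The 80% threshold for AI code comes from the answer above; the 70% baseline for human-written code is an assumption for illustration only:

```python
def coverage_gate(
    ai_generated: bool,
    line_coverage: float,
    touches_external_systems: bool,
    has_integration_tests: bool,
) -> list[str]:
    """Apply stricter thresholds to AI-generated code: 80%+ line coverage,
    plus mandatory integration tests when the change talks to external
    systems or data stores. The human baseline of 70% is an assumption."""
    failures = []
    threshold = 0.80 if ai_generated else 0.70
    if line_coverage < threshold:
        failures.append(f"line coverage {line_coverage:.0%} below {threshold:.0%}")
    if ai_generated and touches_external_systems and not has_integration_tests:
        failures.append("integration tests required for AI code touching external systems")
    return failures
```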
43.8% of Claude Code conversations follow a directive pattern — the user specifies the task and the AI completes it with minimal interaction. That means nearly half of AI-generated code is produced with limited human oversight during generation. Your review process must compensate: directive-pattern code needs the same rigour as code from an untrusted contributor.
Beyond standard DevSecOps guardrails: auditable policy-as-code with full Git history, mandatory human threat model review for code touching financial transactions or patient data, automated compliance checks mapped to specific requirements (SOC 2, HIPAA), and evidence that AI-generated code passes the same quality gates as human-written code.
There is no universal floor. It depends on three variables: the percentage of code generated by AI, the risk profile of your application, and the maturity of your automated governance pipeline. A team with mature guardrails, multi-agent validation, and processes-as-code can operate smaller than one relying on manual review. But you still need humans who understand the system well enough to know when the guardrails are not enough.
The Pipeline Problem — Why Pausing Junior Hiring Now Creates a Senior Engineer Shortage Later

The short-term maths for pausing junior hiring makes sense on a spreadsheet. Senior engineers with AI tools produce more per head than juniors on most tasks you can measure. The board likes numbers like that.
But here’s what nobody puts on the quarterly P&L: every senior engineer in your organisation was a junior engineer five to ten years ago. The pipeline that produced them is now shrinking. Stanford’s Digital Economy Lab found that employment for software developers aged 22–25 has fallen nearly 20% from its late 2022 peak. Tech internship postings have dropped 30% since 2023.
This article looks at the pipeline risk that compression without a plan creates — and three practical options for keeping a healthy pipeline while still capturing AI productivity gains. It is part of our broader examination of the forces driving engineering team compression, where we cover the full spectrum of AI’s impact on how engineering organisations are structured.
Let’s be honest about the economics. 84% of developers now use AI tools in their workflow, and senior engineers capture most of the productivity gains because they have the contextual judgment to direct AI output effectively. The Anthropic Economic Index shows 79% of Claude Code interactions are classified as automation — direct task delegation. Seniors know what to delegate. Juniors often don’t.
Klarna, Tailwind Labs, and Shopify have all publicly cut or restructured headcount citing AI productivity. 70% of hiring managers say AI can perform intern-level work. Forrester predicts a 20% drop in CS enrolments and a doubling of time to fill developer roles. These are the forces driving engineering team compression and they’re real.
One senior engineer with AI tools can match the output of two to three juniors on codifiable tasks. You can cut junior headcount today and see no visible quality drop tomorrow. 58% of developers expect engineering teams to become smaller and leaner in 2026.
So why would you not do this?
Every senior engineer on your team was once the junior who broke the build, got confused by a merge conflict, and slowly — over years — built the judgment that now makes them worth their salary. That pipeline is a supply chain. It doesn’t restart on demand.
As Addy Osmani puts it: “If you don’t hire junior developers, you’ll someday never have senior developers.”
The engineering pyramid — a broad base of junior and mid-level engineers supporting a narrower senior layer — is the structure that’s produced engineering leadership for decades. Pull out the base and the middle compresses while the top ages out with nobody to replace them. The labour market evidence showing junior decline is real and it’s accelerating.
This has happened before. EDS paused its Systems Engineering Development programme expecting a three-month recovery. Actual recovery took more than 18 months. Organisations consistently assume pipeline recovery is faster and cheaper than it turns out to be.
Handshake data shows a 30% decline in tech-specific internship postings since 2023, while internship applications have risen 7%. The entry point of the pipeline is contracting even as demand for positions stays high.
Tacit knowledge is the judgment that comes from doing the messy work. It’s the intuition about why a system fails under load, why that API integration has a quirk nobody documented, and how to navigate an outage at 2am when the runbook doesn’t cover the actual problem.
AI can’t replicate tacit knowledge. Worse, it may actually slow the rate at which engineers build it.
The Stanford study draws a critical distinction here. AI substitutes for codified knowledge — the “book-learning” that can be captured and reproduced. Tacit knowledge, the tips and tricks that accumulate with experience, is precisely what AI struggles with.
Here’s the problem: the codifiable tasks AI is automating — writing boilerplate, fixing simple bugs, handling routine testing — are the same tasks that historically taught fundamentals through repetition. Take those away from junior engineers and you remove the mechanism that builds tacit knowledge in the first place. Microsoft’s research calls this “AI drag” — the counterintuitive effect where AI tools actually hinder early-career developers who lack the judgment to evaluate what AI spits out.
Addy Osmani calls the downstream consequence “knowledge debt” — juniors who accept AI suggestions without verification develop shallow understanding that cracks under novel challenges.
The damage doesn’t show up on dashboards. It shows up when your senior engineers leave and nobody understands why the system they maintained actually works. Understanding how the senior engineer role is changing in compressed teams makes the tacit knowledge gap even more apparent.
The mid-level quiet crisis is the canary within the canary. As junior hiring freezes, the supply of future mid-level engineers compresses too, creating a two-stage shortage that pushes upward through the whole organisation.
Engineering leaders discuss this behind closed doors but it rarely gets covered publicly. Mid-level engineers are being squeezed from both ends: expected to govern AI output (a senior task) while still developing their own expertise (historically a junior activity).
And the vibe coding counter-argument doesn’t hold up. Only 15% of professional developers report using vibe coding approaches. 72% say it’s not part of their professional work at all. 66% cite “AI solutions that are almost right, but not quite” as their biggest frustration.
Stack Overflow CEO Prashanth Chandrasekar says AI will “open a whole new career pathway for Gen Z developers.” He might be right about the long term. But new pathways don’t fix the organisational pipeline gap that exists right now. You need senior engineers in three to five years who understand your systems and your codebase. That means growing them internally.
New York Fed data backs up the concern: computer engineering graduates have a 7.5% unemployment rate — higher than fine arts graduates. The pipeline is being squeezed from the supply side too.
Tailwind Labs CEO Adam Wathan was blunt: “75% of the people on our engineering team lost their jobs here yesterday because of the brutal impact AI has had on our business.” The company went from four engineers to one.
This was a crisis response — revenue had dropped 80%, documentation traffic fell 40% as AI tools summarised Tailwind’s content without sending users to the site. “I feel like a failure for having to do it,” Wathan said.
The documentation traffic drop is telling. Documentation is a junior and mid-level responsibility in most organisations. When that traffic vanishes, it signals erosion beyond headcount. For a deeper look at how companies like Tailwind and Shopify have handled junior hiring, the patterns are worth studying side by side.
Shopify offers the contrast. They publicly adopted an AI-first hiring policy — teams must demonstrate they can’t solve a problem with AI before requesting new headcount. But they’re also hiring 1,000 interns. Farhan Thawar made it explicit: “AI adoption isn’t about reducing headcount.”
As Kent Beck, Laura Tacho, and Steve Yegge wrote in the Deer Valley Declaration: “We remain skeptical of the promise of any technology to improve organisational performance without first addressing human and systems-level constraints.” Technology does not substitute for pipeline management.
You’ve got options. All three work. The right choice depends on where your organisation sits today.
Option 1: Structured AI-augmented apprenticeships (the preceptorship model). Pair senior engineers with early-career developers at three-to-one or five-to-one ratios for at least twelve months. Set up AI tools for Socratic coaching rather than direct code generation. The goal is to preserve the cognitive struggle that builds durable capability. Get juniors to explain AI-generated code during reviews.
Option 2: Strategic junior hiring with a deliberate AI-reskilling track. Keep a smaller but intentional junior cohort. Design their first twelve months around tasks AI can’t automate well — production incident response, cross-team integration, customer-facing debugging. Someone trained on your systems with AI assistance might outperform a senior hire who’s never touched these tools.
Option 3: Targeted internship programmes. Even if full-time junior hiring is paused, run focused internships that keep the organisational muscle for onboarding, mentoring, and evaluating early-career talent. You’re keeping the machinery warm so the pipeline can restart when you need it.
The business case for all three is the same: frame pipeline maintenance as supply chain insurance — language your board already understands. They know what happens when a single-source supplier disappears and rebuilding takes eighteen months. Use that framing when building a headcount model that accounts for AI leverage and pipeline risk.
Team compression raises questions most coverage ignores: what do you tell the junior staff who remain? What do you say to candidates you choose not to hire?
These aren’t abstract concerns. 64% of workers aged 22–27 are worried about being laid off. Underemployment rose to 42.5% — its highest level since 2020. As one Stack Overflow author wrote: “There’s still something to mourn here — the shine that coding once had for my generation.”
Compression may be strategically necessary. But organisations that compress without communication damage their employer brand and their ability to attract talent when the pipeline needs to restart. Acknowledge the tension honestly, communicate your strategy to existing staff, and recognise that the decisions you make now determine whether talented engineers want to work for you three years from now. For a complete overview of the broader team compression trend and the full range of decisions it creates for engineering leadership, our AI team compression resource for engineering leaders covers every dimension.
No. AI automates codifiable tasks but it can’t replicate the tacit knowledge, systems judgment, and contextual awareness that only develop through years of hands-on experience. The skills senior engineers have today were built during their junior years — debugging, production incidents, architectural decision-making. AI assists with these tasks but it doesn’t replace the learning that comes from doing them.
Longer than you think. EDS expected a three-month recovery from pausing its Systems Engineering Development programme; actual recovery took more than 18 months. Rebuilding a pipeline means re-establishing mentorship infrastructure, re-attracting candidates, and rebuilding the institutional capacity to onboard and develop people.
It’s a structured mentorship framework that pairs senior engineers with early-career developers at three-to-one or five-to-one ratios for at least a year. AI tools are configured for Socratic coaching rather than direct code generation — the idea is to preserve the learning process while still getting the benefits of AI.
Knowledge distribution thins out, mentorship capacity drops, and operational resilience takes a hit. The remaining engineers carry broader responsibilities. A compressed team isn’t just a smaller version of the original — the cultural shift requires deliberate management.
AI drag is the counterintuitive effect where AI tools actually hinder early-career developers who don’t yet have the systems knowledge to evaluate what AI generates. Instead of accelerating junior development, AI can slow it down by removing the tasks that historically taught fundamentals through repetition.
The supply of junior developer candidates will shrink significantly in two to three years, right around the time current senior engineers start aging out. Organisations that paused junior hiring in 2024–2025 face a compounded shortage: fewer people coming through the internal pipeline and fewer candidates available in the market.
From Writing Code to Orchestrating Agents — How the Senior Engineer Role Is Changing

The senior software engineer’s job description is being rewritten — not by management, but by the AI tools that are automating the coding tasks that used to define the role. 92% of developers now use AI coding assistants monthly, and Atlassian’s CTO Rajeev Rajan says some of his teams are producing 2-5x more output, with some writing zero lines of code.
And yet senior engineers are not being displaced. They’re gaining leverage, while junior and mid-level roles face contraction. The reason comes down to something AI can’t replicate: tacit knowledge. It’s the thing that buffers experienced engineers from the automation wave hitting everyone else.
This article lays out what the new senior role looks like in practice, why “one-pizza teams” of 3-4 AI-augmented seniors are replacing larger squads, and what the quiet crisis hitting mid-level engineers means for your next hiring decision. It is part of our comprehensive guide on how AI is reshaping engineering team structures, where we examine every dimension of the team compression phenomenon.
There are two kinds of knowledge in any engineering organisation. Codified knowledge is the stuff you can write down — algorithms, syntax, common patterns, whatever’s in your wiki. Tacit knowledge is everything else. It’s why your team chose that particular database migration strategy three years ago. It’s which architectural tradeoffs will bite you in six months. It’s the business context that makes one technical decision obviously better than another.
AI models learn codified knowledge readily. That’s precisely what junior engineers primarily hold — and it’s the knowledge layer being automated.
Stanford’s “Canaries in the Coal Mine” paper (Brynjolfsson, Chandar, Chen, November 2025) tracked millions of workers through ADP payroll data and found that early-career workers aged 22-25 in AI-exposed occupations experienced a 16% relative employment decline. Employment for workers aged 35-49 grew by over 8% in the same period. This isn’t a hiring freeze or an interest rate story. It’s structural. For the data underpinning the senior leverage claim in full, including the employment and internship figures that sit behind these numbers, see our detailed analysis.
Look at how developers actually use AI tools and the mechanism becomes clearer. Anthropic’s Economic Index analysis of 500,000 coding interactions found 79% of Claude Code conversations were classified as “automation” rather than “augmentation.” The agent-based tools are automating execution-level tasks, not senior-level judgment.
The leverage is asymmetric. A senior engineer with AI tools can absorb the output of multiple junior roles because they have the context AI needs to function correctly. BairesDev’s Q4 2025 Dev Barometer puts numbers on the shift: 58% of developers expect teams to become smaller and leaner, and 65% expect their roles to be redefined in 2026.
So what does this leverage actually look like day-to-day?
Nick Durkin, Field CTO at Harness, puts it bluntly: “By 2026, every engineer effectively becomes an engineering manager. Not of people, but of AI agents.” Instead of writing code line by line, you’re managing a collection of agents that handle specific tasks — writing boilerplate, fixing known issues, scanning vulnerabilities, updating dependencies. Your job becomes giving the AI the context it doesn’t have unless you provide it: business intent, historical decisions, tradeoffs, the “why” behind the system.
This is already happening at scale. Rajeev Rajan described it at The Pragmatic Summit in February 2026: “Some teams at Atlassian have engineers basically writing zero lines of code: it’s all agents, or orchestration of agents.” Thomas Dohmke, founder of Entire.io and former CEO of GitHub, runs his startup the same way: “I now have my code review agent, my coding agent, my brainstorming agent, my research agents.” For a detailed look at how Atlassian and Shopify are operationalising this alongside Klarna and Tailwind CSS, see our company benchmarks analysis.
The Harness model proposes specialist AI agents rather than a single general-purpose AI — mirroring how effective human teams work with specialised roles. One agent writes, another reviews, a third scans for vulnerabilities. Justice Erolin, CTO at BairesDev, describes this as engineering teams moving from “builders” to “orchestration-driven units.”
This is worth distinguishing from “vibe coding” — where you describe features in natural language and let AI generate code with minimal oversight. Only about 15% of professional developers have adopted vibe coding, and 72% say it’s not part of their professional work. Agent orchestration requires deep systems understanding and architectural judgment. That gap explains why AI tool use is broad but deep automation is still concentrated among senior engineers.
Amazon popularised the “two-pizza team” — a team small enough to be fed by two pizzas, typically 6-10 people. That model is being compressed. At the Future of Software Development workshop in Deer Valley, Utah (February 2026), a head of engineering at a 200-year-old agriculture company told The Pragmatic Engineer: “We are already seeing the end of two-pizza teams thanks to AI. Our teams are slowly but surely becoming one-pizza teams across the business.”
That’s 3-4 engineers. Around 20 engineering leaders at the same events confirmed the trend.
Rajan describes AI-native teams at Atlassian producing 2-5x more output, and he frames this as a creativity gain: “Efficiency framing is missing the point, it’s more about what you can create now with AI which you could not before.”
Laura Tacho, former CTO of DX, presented data at The Pragmatic Summit that puts the baseline in perspective: 92% of developers use AI coding assistants at least monthly, saving roughly 4 hours per week. But the results are uneven. “Some organisations are facing twice as many customer-facing incidents. At the same time, some companies are also experiencing 50% fewer incidents. AI is an accelerator, it’s a multiplier, and it is moving organisations in different directions.”
Here’s the thing though — the one-pizza team model only works when the rest of the delivery pipeline is also mature. Companies with fully automated delivery pipelines are 78% more likely to ship code more frequently with AI tools, compared to 55% for those with low pipeline automation. If your CI/CD, testing, and deployment are still half-automated, shrinking your team to one pizza is going to hurt more than it helps.
The structural logic is straightforward: 3-4 AI-augmented senior engineers, each managing specialist agents, can match or exceed the output of 8-10 mixed-seniority teams because the AI absorbs execution-level work while seniors provide the architectural direction.
But smaller teams of senior engineers only work if those engineers have the right skills — and the skills that matter have shifted.
Every skill on this list is grounded in an observable signal — a hiring practice, a tool adoption metric, a company policy.
Architectural judgment. 74% of developers expect to spend far less time writing code and far more time designing technical solutions (BairesDev Q4 2025). AI generates code at speed but it can’t evaluate business constraints or anticipate how a system needs to evolve.
Systems thinking. This is the ability to reason about how components interact across the full stack — not just the code, but operational realities, security implications, and scaling constraints. Architectural judgment tells you what to build. Systems thinking tells you what breaks when you build it.
AI output validation. The biggest frustration with AI tools, cited by 66% of developers in the Stack Overflow 2025 survey, is “solutions that are almost right, but not quite.” Farhan Thawar, VP Engineering at Shopify, expects engineers to be “90 or 95%” reliant on AI while remaining capable of identifying single-line errors themselves.
Security and governance awareness. The “guardrails, not gates” model from Nick Durkin of Harness is worth paying attention to. When AI can generate changes faster than humans can review them, security can’t sit downstream anymore. The feedback loop has to be immediate. The governance responsibilities that now fall to senior engineers — including code review, policy enforcement, and audit readiness — are covered in detail in our governance guide.
Communication and context provision. Translating organisational context into agent-readable instructions is the human layer AI can’t replicate. Without it, agents produce output that looks correct but isn’t.
The skills conversation raises an uncomfortable question: what about the engineers who built their careers on the codified skills AI is now automating?
Gergely Orosz of The Pragmatic Engineer identified a “quiet crisis” among mid-career engineers — something discussed behind closed doors but rarely addressed publicly. Mid-career engineers (typically 3-8 years experience) are being outpaced by AI tools that replicate their codified skills and by new graduates who’ve grown up with the tools.
The structural gap is clear. Mid-level engineers don’t have the deep tacit knowledge that buffers senior engineers. But they also don’t have the AI-native fluency that new graduates demonstrate. They’ve got enough experience to feel senior but not enough tacit knowledge to be irreplaceable.
This is where your actual retention and morale problems live. Seniors are gaining leverage. Juniors are being hired less. But mid-level engineers are the operational backbone and they’re getting the least attention. This is also where the pipeline risk created by a purely senior team becomes most visible — without junior and mid-level engineers progressing, you have no pathway to the senior talent you need three to five years from now.
So what can you do about it? Pair mid-levels with seniors on architectural decisions to accelerate tacit knowledge transfer. Invest in AI tooling upskilling with dedicated time — not side-of-desk expectations. Redefine performance metrics to reward orchestration capability, not just code output. If mid-level engineers feel their progression has stalled, they leave — and rebuilding that layer is expensive.
Hiring criteria need to change alongside team structures. The judgment to know when not to trust the agent is just as important as the ability to direct it.
Shopify offers a useful template here. AI tools including Copilot and Cursor are openly allowed in coding interviews. Thawar observed that candidates who don’t use them “usually get creamed by someone who does.” But Shopify also expects engineers to spot and fix single-line errors without the AI — genuine understanding, not just prompting fluency.
For 2026, BairesDev identifies the most pressing talent gaps: 42% of project managers cite AI/ML specialists, followed by data engineers (16%) and prompt/AI application engineers (11%).
Don’t build a team entirely of 15-year veterans or entirely of AI-tool-proficient new hires. The mid-level crisis shows what happens when one layer is neglected. Aim for a mix of deep architectural experience and AI-native fluency.
And don’t abandon junior hiring entirely. Forrester’s 2026 predictions caution that companies halting junior hiring would “most likely struggle with knowledge gaps and a lack of internal growth.” If you don’t hire junior developers, you will someday never have senior developers. For a practical framework on building a headcount model around senior AI-augmented engineers — one that also accounts for pipeline health — see our decision-making guide. For the complete AI team compression overview and what it means for engineering leadership, the hub covers every dimension from labour market evidence through to planning frameworks.
Agent orchestration is the practice of directing, configuring, and supervising multiple specialist AI agents to execute development tasks — writing boilerplate, scanning vulnerabilities, updating dependencies — rather than writing code directly. The engineer provides context (business intent, system history, tradeoffs) and validates outputs. Nick Durkin (Harness) distils this as “every engineer becomes an engineering manager — not of people, but of AI agents.”
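As a toy illustration of that orchestration pattern, the sketch below routes tasks to stand-in specialist agents while validation stays with the human. Every agent name and function here is hypothetical; real coding agents expose very different interfaces.

```python
# Toy sketch of agent orchestration: the engineer supplies context and
# routes each task to a specialist "agent"; the agent executes, and
# validating the output stays with the human. All names are hypothetical.

from typing import Callable

# Stand-ins for specialist coding agents, one per class of
# execution-level work the article lists.
AGENTS: dict[str, Callable[[str], str]] = {
    "boilerplate": lambda task: f"generated scaffolding for {task}",
    "security":    lambda task: f"scanned {task} for vulnerabilities",
    "deps":        lambda task: f"updated dependencies in {task}",
}

def orchestrate(task: str, agent: str, context: str) -> str:
    """Route a task plus its business context to one specialist agent."""
    result = AGENTS[agent](f"{task} ({context})")
    # In practice the engineer reviews this output before anything merges.
    return result

print(orchestrate("payment-service", "security", "PCI scope, legacy Django"))
```

The point of the pattern is the division of labour: context and validation are human work, execution is agent work.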
Tacit knowledge is the accumulated, experience-based understanding of a system’s history, architectural decisions, team dynamics, and business constraints that can’t be easily documented or transferred to an AI model. Stanford’s “Canaries in the Coal Mine” paper found that roles concentrated in codified knowledge are most exposed to AI automation, while tacit-knowledge-intensive roles remain stable.
A one-pizza team is 3-4 engineers — small enough to be fed by one pizza — as reported by The Pragmatic Engineer from industry events in February 2026. It contrasts with the traditional “two-pizza team” (6-10 people) popularised by Amazon. AI tools enable the smaller team to match or exceed the output of the larger one.
92% of developers use AI coding assistants at least once per month, according to DX data presented at the Pragmatic Summit in February 2026. JetBrains’ 2025 report shows 85% use at least one AI tool. Adoption is near-universal; the differentiator is now how effectively engineers use these tools, not whether they use them.
Vibe coding means describing features in natural language and letting AI generate code with minimal technical oversight. Only about 15% of professional developers have adopted it (Stack Overflow 2025), primarily for prototyping. Agent orchestration requires deep systems understanding, architectural judgment, and active validation — it’s the rigorous, senior-level counterpart to vibe coding.
Rajeev Rajan (CTO, Atlassian) describes teams where engineers write zero lines of code directly — “it’s all agents, or orchestration of agents.” These teams produce 2-5x more output, and Rajan frames this as a creativity gain: “Efficiency framing is missing the point.”
Architectural judgment, systems thinking, AI output validation, security and governance awareness, and the ability to provide context to AI agents. Shopify (Farhan Thawar) allows AI tools in coding interviews but expects engineers to identify single-line errors themselves. BairesDev identifies AI/ML integration and system-level architecture as the top talent gaps for 2026.
Yes — mid-career engineers (3-8 years experience) face a structural squeeze. They don’t have the deep tacit knowledge that buffers seniors and they don’t have the AI-native fluency of new graduates. The Pragmatic Engineer calls this the “quiet crisis.” The emerging response involves accelerating tacit knowledge transfer and investing in AI tooling upskilling for this cohort.
DX data presented at the Pragmatic Summit (February 2026) by Laura Tacho shows developers self-report saving roughly 4 hours per week. However, results vary widely — “healthy” organisations see 50% fewer incidents while “unhealthy” ones see 2x more incidents from the same tooling.
Nick Durkin (Harness) argues that “history shows that major technological shifts do not eliminate work. They expand what is possible.” However, the nature of the work is changing. Stanford data shows entry-level employment declining while experienced roles remain stable, suggesting displacement is concentrated in routine coding tasks rather than across the profession.
What the Data Actually Shows About AI and Junior Developer Employment Decline

The debate about whether AI is displacing junior developers has moved past the speculation stage. We have data now. But the data needs careful reading, because it does not all point in the same direction.
The strongest evidence comes from the Stanford Digital Economy Lab. Using ADP payroll records covering 3.5 to 5 million workers, researchers found significant employment declines for workers aged 22-25 in the most AI-exposed occupations. A Danish study using similar methodology found near-zero effects. And ordinary recession dynamics muddy the picture further.
So in this article we are pulling together multiple independent data sources to work out what the evidence actually supports, where it falls short, and why that matters if you are making workforce decisions right now. It connects to the broader team compression phenomenon and builds on what team compression actually means in practice.
Workers aged 22-25 in the most AI-exposed occupations experienced a 16% relative employment decline compared to the least AI-exposed occupations. That comes from the Stanford/ADP study. In absolute terms, employment for this age group has fallen roughly 20% from its peak.
Here is where it gets interesting. Workers aged 35-49 in the same AI-exposed occupations saw 6-9% employment growth over the same period. The decline is not hitting everyone. It is hitting the youngest workers in roles where AI does the most work.
That asymmetry is the finding that matters. Within the same firms, at the same time, junior workers in AI-exposed roles declined while experienced workers grew. That looks like a structural shift, not a cyclical downturn.
Independent data backs this up. NY Fed College Labor Market data shows CS graduate unemployment at 6.1%, computer engineering at 7.5%. The overall rate for young workers aged 22-27 sits at 7.4% — nearly double the national average. By Q4 2025, the underemployment rate for recent college graduates climbed to 42.5%, its highest level since 2020.
The implications of the junior employment decline for the talent pipeline are significant. But first, it is worth understanding why this particular study carries weight.
“Canaries in the Coal Mine?” was published in November 2025 by Erik Brynjolfsson, Bharat Chandar, and Ruyu Chen at the Stanford Digital Economy Lab.
The study uses ADP administrative payroll data. ADP is the largest payroll processing firm in the US, covering over 25 million workers. The researchers worked with individual-level monthly records for 3.5 to 5 million workers across tens of thousands of firms through September 2025. This is actual payroll data — not surveys, not job-posting scraping. It tracks real employment.
Occupations are classified into AI exposure quintiles using the Eloundou et al. (2023) framework. Software engineering sits in the top quintile. The study also uses Anthropic Economic Index data on the share of Claude queries involving automation versus augmentation — a second framework that independently backs up the exposure measures.
The methodological detail worth understanding is the firm-time effects controls. In plain language: the study compares workers within the same firm at the same time. If a company shrank overall, that shock gets absorbed. What remains is the differential between junior and experienced workers in AI-exposed roles within those firms.
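In code, the same idea can be shown with a toy dataset. The numbers below are invented for illustration, not the Stanford data: demeaning within each firm-month cell absorbs any firm-wide shock, and what survives is the junior-versus-senior differential.

```python
# Toy illustration of firm-time fixed-effects controls.
# Invented numbers: firm B shrank overall, but in BOTH firms juniors
# underperform seniors by the same 16 points.

import pandas as pd

df = pd.DataFrame({
    "firm":   ["A", "A", "B", "B"],
    "month":  ["2023-01"] * 4,
    "cohort": ["junior", "senior", "junior", "senior"],
    "emp_change": [-16.0, 0.0, -26.0, -10.0],  # hypothetical % changes
})

# Absorb the firm-wide shock by demeaning within each (firm, month) cell.
cell_mean = df.groupby(["firm", "month"])["emp_change"].transform("mean")
df["within"] = df["emp_change"] - cell_mean

# What remains is the junior/senior gap, identical in both firms.
gap = (df.loc[df.cohort == "junior", "within"].mean()
       - df.loc[df.cohort == "senior", "within"].mean())
print(f"junior-senior differential: {gap:.0f} points")  # -16 points
```

Firm B's overall contraction disappears after demeaning; only the within-firm cohort gap is left, which is exactly what the study's design isolates.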
Pre-2022 placebo tests confirm there was no pre-existing divergence before GenAI tools became widely available. The pattern showed up specifically in late 2022 and early 2023.
After publication, independent researchers using LinkedIn data from the US and UK found similar declines, as discussed in the Stanford paper itself. The core finding has been replicated.
The Stanford study is not the only evidence. Several independent sources using different methodologies point the same way.
Internship postings are shrinking. Handshake reports a 30% decline in tech-specific internship postings since 2023. Applications rose 7% over the same period. More people chasing fewer openings.
Entry-level tech hiring is down. Indeed’s Hiring Lab shows US tech job postings down 36% from February 2020 levels as of July 2025. Software engineers — the most common tech job title — were down 49% from early 2020. Entry-level tech hiring specifically dropped 25% year-over-year in 2024.
Employer sentiment is shifting. A 2024 SHRM survey found 70% of hiring managers say AI can do the jobs of interns. 57% trust AI’s work more than that of interns or recent graduates. Let that sink in for a moment.
Developer adoption is near-universal. The Stack Overflow 2025 Developer Survey shows 84% of developers now use AI tools, up 14 percentage points from 2023.
CS enrolments may follow. Forrester’s 2026 Predictions project a 20% decline in CS enrolments as prospective students respond to deteriorating job market signals. Fewer CS graduates now could produce a senior engineer shortage in 5-10 years.
These are trailing indicators (employment data), concurrent indicators (job postings), and leading indicators (enrolment forecasts, employer sentiment). They are not all measuring the same thing, but they are all moving in the same direction.
The decline is not uniform across roles either. Developer roles like Android, Java, .NET, iOS, and web development are down 60% or more from 2020, while machine learning engineer postings are up 59%. The reallocation within tech is as telling as the overall decline.
It comes down to what kinds of knowledge AI can replace. This is also why AI is compressing teams rather than replacing programmers — the compression is selective.
Junior developers primarily supply codified knowledge. Writing boilerplate code, implementing standard patterns, debugging routine errors, writing unit tests. Well-documented, repeatable work.
Experienced engineers supply tacit knowledge. System-level architectural decisions, debugging novel failures across distributed systems, navigating stakeholder dynamics, mentoring. The kind of accumulated expertise that has never been written down and in many cases cannot be.
GenAI tools are excellent at codified tasks. The Anthropic Economic Index shows 79% of Claude Code conversations are classified as automation rather than augmentation. The specialist coding agent operates predominantly as a substitute for labour, not a complement to it.
So AI directly substitutes for the task mix that juniors perform, while it augments the work of experienced engineers who use it to amplify their existing judgement. The Stanford study confirms this: employment declines are concentrated in occupations where AI automates rather than augments, and junior roles cop the worst of it.
Wage stickiness adds another signal. Compensation has not fallen proportionally to employment. Firms are reducing headcount rather than cutting pay — which is consistent with task substitution at the entry level rather than a general softening of demand.
Humlum and Vestergaard published an NBER Working Paper (33777) examining LLM effects on earnings and hours worked in Denmark. They found “precise null effects” — ruling out impacts larger than 2% two years after ChatGPT adoption.
This is a credible study. It was discussed at the 2025 NBER Summer Institute, Chicago Booth, the Chicago Fed, Microsoft Research, and MIT Sloan FutureTech. You cannot just dismiss it.
But several structural differences explain the divergence. Denmark has stronger labour protections and collective bargaining — rapid headcount reduction is simply harder there. The US tech sector adopted GenAI coding tools faster and more aggressively. And the post-2022 US tech correction created a permissive environment for replacing junior roles with AI tooling that Denmark did not experience.
The measurement difference matters too. Denmark measures earnings and hours for existing workers. Stanford measures employment counts. Firms can stop hiring new workers (the US pattern) without changing anything for the workers they already have (the Denmark measurement). Both studies can be correct at the same time.
The honest conclusion: US data shows a real, large employment decline correlated with AI exposure. Danish data shows this outcome is not inevitable everywhere. Structural context determines whether AI exposure translates into actual displacement.
The Stanford authors are upfront about this: “While we explore a variety of alternative explanations, we caution that the facts we document may in part be influenced by factors other than generative AI.”
The post-pandemic tech correction is a genuine confound. Layoffs at Meta, Google, Amazon and others in 2022-2023 hit junior engineers hardest, and they overlap temporally with GenAI adoption. Indeed’s analysis notes that nearly half the net decline in tech postings occurred before ChatGPT’s release — but suggests AI may be preventing the rebound that would otherwise have happened.
The post-ZIRP environment compressed entry-level demand independently of AI. The Stanford study addresses this by showing results hold separately for high and low interest-rate-exposed occupations. But the temporal overlap means the two cannot be fully untangled.
Sectoral breakdown is missing. The data does not distinguish between Big Tech, mid-market SaaS, FinTech, or other segments. The decline could be concentrated in specific sectors rather than spread evenly. Non-US data beyond Denmark is sparse — we do not know whether Australia, the UK, or Canada show US-like or Denmark-like patterns. Recovery scenarios are unmodelled.
What we can say with confidence: there is a statistically significant, age-asymmetric decline in employment for 22-25 year olds in AI-exposed occupations, robust to firm-time controls and pre-existing trend testing. What remains uncertain: whether AI is the primary cause or a contributing factor, the sectoral distribution, and whether the trend will persist or reverse. The company-level evidence corroborating these figures adds further texture, but the picture is still developing. For a complete overview of how this evidence connects to engineering team strategy, see our AI team compression guide for engineering organisations.
The Stanford/ADP study found a 16% relative employment decline for workers aged 22-25 in the most AI-exposed occupations compared to the least exposed. In absolute terms, employment for this cohort has fallen approximately 20% from its peak. Meanwhile, workers aged 35-49 in the same occupations saw 6-9% growth.
“Canaries in the Coal Mine” (Brynjolfsson, Chandar, Chen, November 2025) uses ADP administrative payroll data covering 3.5-5 million US workers monthly. It classifies occupations by AI exposure quintiles and applies firm-time effects controls to isolate AI-related employment changes from company- or industry-level shocks.
Multiple factors are converging. AI tools automate the codified tasks that juniors have traditionally performed, tech sector layoffs reduced entry-level openings, and employer sentiment has shifted toward viewing AI as a substitute for intern-level labour. Entry-level tech hiring dropped 25% year-over-year in 2024.
Handshake reports a 30% decline in tech-specific internship postings since 2023. Indeed shows an 11% decline across all industries. AI tools are not the sole cause — the post-pandemic tech correction contributes — but 57% of hiring managers now trust AI’s work more than that of interns or recent graduates.
The Humlum and Vestergaard study (NBER Working Paper 33777) found “precise null effects” in Denmark — ruling out earnings or hours impacts larger than 2%. Structural differences (stronger labour protections, different AI adoption rates, different measurement approach) likely explain why Danish results diverge from the US findings.
Both likely play a role. The Stanford study uses firm-time effects controls and pre-2022 placebo tests to isolate AI-specific effects, but the temporal overlap between GenAI adoption and the post-ZIRP tech correction means the two cannot be fully pulled apart. The age-asymmetric pattern supports AI as a distinct factor.
According to NY Fed data, computer science graduate unemployment is 6.1% and computer engineering graduate unemployment is 7.5%. The overall rate for young workers aged 22-27 is 7.4%. These figures are elevated relative to historical norms for technical fields.
Codified knowledge is the documented, repeatable stuff — boilerplate code, standard patterns, routine debugging. Tacit knowledge is system-level judgement, architectural decisions, and organisational context. AI is excellent at codified tasks, which makes junior developers more vulnerable to automation than experienced engineers who have accumulated tacit expertise over years.
Amodei has stated that approximately 50% of entry-level white-collar jobs could be eliminated within five years. This is a forward projection from the CEO of Anthropic, not a peer-reviewed finding. It contextualises the trend data but should be treated as an informed prediction, not evidence.
The Stack Overflow 2025 Developer Survey shows 84% of developers now use AI tools in development, up 14 percentage points from 2023. AI performance on SWE-Bench improved from 4.4% to 71.7% of problems solved between 2023 and 2024. Adoption is not gradual — it is near-universal.
Forrester’s 2026 Predictions forecast a 20% drop in CS enrolments as prospective students respond to deteriorating job market signals. This sets up a potential feedback loop: fewer CS graduates entering the pipeline today could produce a senior engineer shortage in 5-10 years, even as AI reduces demand for entry-level workers.
AI Is Not Replacing Programmers — It Is Compressing Teams and Here Is Why That Distinction Matters

Engineering teams are shrinking at companies that are growing. Shopify now tells employees to prove they can’t do something with AI before they’re allowed to request new headcount. Klarna has kept output steady while trimming engineering numbers. There’s a name for this: team compression. Igor Ryazancev coined the term in early 2026 to describe what happens when AI tools let a smaller, more senior team produce equal or greater output.
This article is part of our comprehensive guide to team compression and its implications for engineering leadership. In this piece we’re going to break down the mechanism behind compression, the data that proves it, how fast it’s spreading, and why the way you frame it changes how you plan your engineering organisation.
Team compression is a reduction in engineering headcount driven by productivity multiplication. Not strategic retreat. Not business failure. AI coding agents — Claude Code, GitHub Copilot, Cursor — let individual engineers absorb work that previously needed additional people. The output stays the same or goes up. The team gets smaller.
Ryazancev’s framing positions AI as a “productivity multiplier” rather than a replacement engine. AI handles the repetitive, time-consuming stuff. Engineers focus on problem-solving, system design, and the creative work that actually moves a product forward.
This is already happening. 58% of developers expect engineering teams to become smaller and leaner as entry-level coding tasks get automated. Some Atlassian engineering teams now have engineers writing zero lines of code — it’s all agents — and those teams are producing two to five times more output. If you want the company-level evidence from Shopify, Klarna, and Tailwind, how Shopify, Klarna, and Tailwind are responding covers that in detail.
The replacement narrative says AI eliminates the need for human engineers. That’s empirically wrong. Demand for senior engineers is strong or increasing. What’s actually happening is the same work gets done by fewer people because each person is more productive. That’s a completely different organisational dynamic.
And it leads to a different response. If you think AI is replacing engineers, the rational move is defensive — upskill, protect jobs, slow adoption. If you recognise AI is compressing teams, the rational move is strategic — restructure, rethink your hiring pipelines, redefine roles. Companies that understand compression will reshape their organisations proactively. Companies that treat this as replacement will hoard headcount or freeze in place.
Dario Amodei, Anthropic’s CEO, has estimated AI could affect roughly 50% of entry-level white-collar jobs within five years. In the compression framing, that’s a signal about scale — not a prediction of mass unemployment. The jobs change. They don’t vanish uniformly. 65% of developers expect their roles to be redefined in 2026, and of those, 74% expect to spend far less time writing code and far more time designing technical solutions.
The media defaults to “replacement” because it makes a better headline. But it leads to the wrong playbook. The broader team compression phenomenon and what it means for engineering leadership requires a different set of moves entirely.
This is the core mechanism. Anthropic’s Economic Index classifies AI interactions into two buckets: automation, where AI directly performs the task, and augmentation, where AI collaborates with a human to enhance their output.
Claude Code shows 79% automation versus only 21% augmentation. Most coding agent interactions are the AI doing the work, not assisting a human doing the work. Compare that to Claude.ai — the chatbot — which sits at 49% automation. The agent form factor shifts the balance dramatically toward autonomous task completion.
Here’s what that means in practice. Automation displaces codified, repeatable, well-specified work — the stuff you’d typically hand to a junior developer. Augmentation amplifies judgment-intensive work that relies on tacit knowledge — system design, architectural decisions, stakeholder communication. The senior engineer role is expanding into what Justice Erolin, CTO at BairesDev, describes as “part architect, part AI orchestrator, and part systems-level problem solver.”
The impact is uneven by design. Junior developers face compression because their work overlaps heavily with what AI automates. Senior engineers get augmented because their work requires the kind of contextual judgment AI can’t replicate. As Brynjolfsson and colleagues at the Stanford Digital Economy Lab found, AI is “automating the codifiable, checkable tasks that historically justified entry-level headcount, while complementing the judgment-, client-, and process-intensive tasks performed by experienced workers.”
The numbers back this up. Early-career workers aged 22 to 25 in AI-exposed occupations experienced a 16% relative employment decline. Employment for experienced workers in those same occupations increased 6 to 9%. For the labour market data behind junior developer decline, the evidence is substantial. For what the senior engineer role is becoming, the shift is already underway.
The Anthropic Economic Index analysed 500,000 coding-related interactions across Claude.ai and Claude Code. This is behavioural data — what developers actually do — not survey data or projections.
Two automation subtypes stand out. Directive interactions (43.8% of Claude Code use) are where the developer describes a task and the AI completes it end-to-end. Feedback loop interactions (35.8%) are autonomous but iterative — the developer pastes error messages back, the AI adjusts, and the cycle repeats until the task is done. Together, those two patterns account for the roughly 79% automation figure.
The automation story isn’t one thing. Some of it is fully hands-off. Some needs a human in the loop but still takes the human out of the implementation work. Anthropic’s own researchers note that “more capable agentic systems will likely require progressively less user input.” That trend line should get your attention.
There’s also a clear adoption gap by company size. Startup work accounted for 33% of Claude Code conversations versus only 13% for enterprise. Startups have fewer legacy constraints and faster adoption cycles. Large organisations are catching up, but more slowly. If you’re at a big company, know that the startups competing with you are already further along.
Adoption is broad. 92% of developers use AI coding assistants at least once a month. JetBrains puts the figure at 85%. Pick your source — the range is 84 to 92%, and the direction is up.
But broad adoption doesn’t equal deep automation. Only about 15% of developers have adopted vibe coding professionally. 72% say it’s not part of their work at all. And there’s good reason for the gap: 66% of developers cite “AI solutions that are almost right, but not quite” as their biggest frustration. Sound familiar?
So the picture is nuanced. Nearly every developer has access to AI tools. The fully autonomous workflows that drive the deepest compression are still a minority practice. But compression will accelerate as that gap closes. And this isn’t a one-time adjustment — the tools improve quarterly, and each improvement shifts more work from augmentation to automation.
The compression framing gives you a concrete playbook. It demands three shifts.
First, rethink your hiring ratios. Fewer juniors, more seniors, different onboarding. 42% of project managers already identify AI/ML specialists as the biggest talent gap for 2026. That tells you where the demand is heading.
Second, plan for the pipeline problem. If you stop developing juniors now, you won’t have seniors in 2030. Prashanth Chandrasekar, Stack Overflow’s CEO, put it plainly: “If you don’t hire junior developers, you’ll someday never have senior developers.” Internship postings in tech have dropped 30% since 2023 while applications have risen 7%. The long-term pipeline consequences of pausing junior hiring are a downstream risk most organisations haven’t accounted for.
Third, redefine what “team size” means for output planning. The old headcount-to-output ratios don’t hold when each engineer is augmented by AI. You need new benchmarks.
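One way to build such a benchmark, sketched with purely illustrative numbers: start from the output a backlog was previously sized for and divide by an assumed per-engineer multiplier. The 2.5x figure below is an assumption for the example, not a measured result.

```python
# Hypothetical planning sketch for the third shift: recomputing team
# size from target output under an assumed AI productivity multiplier.

import math

def engineers_needed(target_output_units: float, ai_multiplier: float) -> int:
    """Headcount required if each engineer produces `ai_multiplier` units."""
    return math.ceil(target_output_units / ai_multiplier)

# A backlog previously sized for a 9-person team (9 units of output):
print(engineers_needed(9, 1.0))  # pre-AI baseline: 9 engineers
print(engineers_needed(9, 2.5))  # assumed 2.5x multiplier: 4 engineers
```

The model is deliberately crude; the useful discipline is making the multiplier an explicit, measured input rather than an implicit assumption baked into old headcount ratios.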
The distinction also changes how you talk to your board. “We are compressing teams with AI” is a productivity story — it signals strategic sophistication. “AI is replacing our engineers” is a risk story that triggers defensive responses. The framing matters for capital allocation.
Forrester’s advice is worth repeating: “Don’t abandon entry-level hiring.” Someone trained on your systems with AI assistance might outperform a senior hire who has never touched these tools. The compression thesis doesn’t mean you stop investing in people. It means you invest differently.
Something else is going on. AI is automating the codified, repeatable tasks that junior developers typically perform — but it’s not eliminating the need for developers altogether. The result is team compression: fewer juniors get hired because AI absorbs their task portfolio, while senior engineers become more productive. The 16% relative employment decline for ages 22-25 in AI-exposed occupations reflects this compression dynamic, not wholesale replacement.
The Anthropic Economic Index is a recurring research series from Anthropic that analyses how Claude is actually used across the economy. It classifies hundreds of thousands of real coding interactions as either automation (AI does the task) or augmentation (AI assists a human). Its finding that 79% of Claude Code interactions are automation — compared to 49% for the Claude.ai chatbot — gives you empirical evidence that coding agents are qualitatively different from AI assistants. That’s a meaningful distinction when you’re planning team structures.
It’s not hype. Shopify requires AI-first approaches before approving new headcount. Klarna has reduced engineering headcount while maintaining or increasing output. Some Atlassian engineering teams have engineers writing zero lines of code while producing two to five times more output. These are structural decisions, not experiments.
Adoption is broad but uneven. DX data shows 92% of developers use AI coding assistants monthly. However, only 15% report using vibe coding professionally. The gap between broad tool access and deep autonomous use is where the adoption frontier sits right now.
Directive interactions (43.8% of Claude Code use) happen when a developer describes a task and the AI completes it end-to-end with minimal further input. Feedback loop interactions (35.8%) are autonomous but iterative — the developer provides error messages or validation, the AI adjusts, and the cycle repeats. Both count as automation, not augmentation. The developer isn’t doing the implementation work in either case.
Dario Amodei, Anthropic’s CEO, has estimated AI could affect approximately 50% of entry-level white-collar jobs within five years. In the context of team compression, this signals the scale of the shift rather than predicting mass unemployment. It reflects a world where entry-level task portfolios are increasingly automated, changing the composition of teams rather than eliminating the need for human engineers.
“We are compressing teams with AI” is a productivity and efficiency narrative — it signals strategic sophistication. “AI is replacing our engineers” is a risk narrative that triggers defensive board responses. The compression framing positions headcount reduction as a deliberate investment in AI-augmented productivity rather than an admission of disruption. How you frame it determines whether your board backs the strategy or pumps the brakes.
If companies pause junior hiring because AI handles junior-tier tasks, the junior developers who would have become senior engineers in five to ten years never get developed. That creates a foreseeable senior engineer shortage in 2030-2035. Internship postings in tech have declined 30% since 2023 while applications have risen 7% — the pipeline is already contracting. This is one of those problems that’s easy to ignore now and very expensive to fix later.
No. Vibe coding is one specific workflow — you describe what you want in natural language and hand implementation entirely to AI. Only about 15% of developers use vibe coding professionally. Team compression is the organisational outcome that emerges from many AI-augmented workflows, of which vibe coding is the most extreme but least common.
Compression is further along at startups. Anthropic’s data shows 33% of Claude Code conversations serve startup work versus only 13% for enterprise applications. Startups have fewer legacy constraints, smaller teams already, and faster adoption cycles. Large enterprises are experiencing compression more slowly because of existing team structures and longer procurement cycles for AI tools. If you’re enterprise, the startups in your space are already ahead on this.
What the AI Memory Shortage Means for Tech Companies and What Comes Next

The global memory market has a problem. AI infrastructure buildout has reallocated semiconductor manufacturing capacity away from conventional memory toward high-bandwidth memory (HBM) for AI accelerators, and the downstream effects are reaching every corner of the tech industry. DRAM prices have more than doubled through 2025, and analysts project further significant increases through 2026.
If you’re running a tech company, this affects your cloud bills, your hardware refresh plans, your product economics, and your AI deployment costs. This series maps the full picture — what’s happening, why, who is winning and losing, and what you can do about it.
The AI memory shortage is a manufacturing capacity problem, not a materials shortage. Semiconductor fabs — principally Samsung, SK Hynix, and Micron — have reallocated wafer production from conventional DRAM (used in PCs, phones, and servers) to high-bandwidth memory (HBM), which AI accelerators like Nvidia GPUs require. Because one HBM stack uses roughly three times the wafer capacity of equivalent standard DRAM, the pivot to HBM is compressing supply across every other memory-dependent market.
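That three-to-one trade-off can be sketched as toy arithmetic. The wafer-capacity split below uses made-up round numbers, not actual fab data; only the 3:1 ratio comes from the reporting above.

```python
# Toy model of the wafer trade-off described above. The 3:1 wafer-capacity
# ratio is from the article; the allocation fraction is a made-up example.

def memory_output(total_bits, hbm_fraction, hbm_wafer_penalty=3):
    """Split total wafer bit-capacity between conventional DRAM and HBM.

    Each bit of HBM costs `hbm_wafer_penalty` bits of forgone conventional
    DRAM, so HBM output is the allocated capacity divided by the penalty.
    """
    conventional = total_bits * (1 - hbm_fraction)
    hbm = total_bits * hbm_fraction / hbm_wafer_penalty
    return conventional, hbm

# Shifting 30% of wafer capacity to HBM cuts conventional supply by 30%
# but yields only 10 'bit units' of HBM per 100 units of original capacity.
conv, hbm = memory_output(total_bits=100, hbm_fraction=0.30)
print(f"conventional: {conv:.0f}, HBM: {hbm:.0f}")  # conventional: 70, HBM: 10
```

The asymmetry is the point: every wafer moved to HBM removes three times as much conventional supply as the HBM it creates.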
Unlike the pandemic-era chip shortage, this shortage stems from deliberate manufacturing allocation decisions, not external supply disruption. When a fab pivots production lines to HBM, the conventional DRAM those wafers would have produced simply does not exist. There is no buffer, no substitute, no emergency reserve.
The tipping point came when hyperscaler capital expenditure commitments locked in allocation priorities at a scale that pushed the market into acute shortage. OpenAI’s Stargate initiative alone is projected to consume up to 40% of global DRAM output. Google, Amazon, Microsoft, and Meta placed open-ended orders with memory suppliers, indicating they would accept as much supply as available regardless of cost.
IDC’s Francisco Jeronimo put it plainly: “For an industry that has long been characterised by boom-and-bust cycles, this time is different.”
Understanding why AI consumes so much memory helps explain why these allocation decisions are unlikely to reverse any time soon. For the full technical explanation of how we got here, see How AI killed the memory supply chain and why everything else is paying for it.
Large language models are not compute-limited — they are memory-bandwidth-limited. The “memory wall” means GPUs can only process data as fast as memory can deliver it, so AI accelerators require high-bandwidth memory that conventional DRAM cannot provide. For inference, the KV cache (the working memory that tracks active conversation state) grows with every concurrent user and every longer context window, making memory demand non-linear as AI products scale.
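The non-linearity is easy to see with a back-of-envelope KV cache estimate. The model dimensions below are illustrative assumptions (a hypothetical 70B-class model with grouped-query attention), not any specific product.

```python
# Back-of-envelope KV cache sizing for a hypothetical transformer.
# All model parameters below are illustrative assumptions, not any
# vendor's actual architecture.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Memory for one session's KV cache (keys and values, hence the 2)."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Example: 80 layers, 8 KV heads, head dim 128, fp16 cache, 32k context.
per_user = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128,
                          seq_len=32_000)
print(f"{per_user / 2**30:.1f} GiB per 32k-token session")   # 9.8 GiB
print(f"{1000 * per_user / 2**40:.1f} TiB for 1,000 sessions")  # 9.5 TiB
```

Memory is linear in context length and linear in concurrent users, so a product scaling both at once sees demand grow with their product, which is why inference memory, not compute, becomes the binding constraint.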
A single large LLM inference session requires significantly more high-speed memory than comparable traditional compute workloads. HBM is not optional for AI at scale — adding more GPU compute without proportional HBM bandwidth is largely wasted investment. SemiAnalysis estimates that HBM constitutes 50% or more of the cost of the packaged GPU.
For companies building AI products, inference memory costs are the dominant ongoing infrastructure expense, not training costs. Micron predicts the total HBM market will grow from $35 billion in 2025 to $100 billion by 2028 — a figure larger than the entire DRAM market in 2024.
We cover the technical deep dive on the memory wall in How AI killed the memory supply chain, and the cost management implications in How to run AI workloads and manage infrastructure costs.
Memory prices have already risen sharply and forecasters expect further increases through at least 2026. DRAM prices overall have risen 172% since the shortage began. TrendForce revised its Q1 2026 conventional DRAM contract price forecast to +90–95% quarter-on-quarter — more than double the previous record quarterly increase. The global memory market is forecast to reach $551.6 billion in 2026.
The price escalation is broad-based. DDR5 for enterprise servers, LPDDR5X for smartphones, and HBM for AI accelerators have all seen steep increases. Enterprise SSDs have been prioritised over consumer NAND, pushing NAND flash contract prices up 55–60% as well. Spot market buyers — those without long-term supply agreements — are experiencing the full force of this volatility.
The current analyst consensus is that price normalisation is unlikely before 2027–2028 at the earliest. For the full price data and market forecasts, see DRAM prices in 2026 have doubled and the numbers are getting worse.
The three major memory manufacturers — Samsung, SK Hynix, and Micron — are the primary beneficiaries. Constrained supply inflates their margins on both HBM and conventional DRAM, and together they control more than 90% of global memory chip production. Hyperscalers secured multi-year long-term agreements (LTAs) ahead of the shortage and have preferential supply access. PC makers, smartphone OEMs, gaming hardware vendors, and most enterprise buyers are absorbing cost increases with limited recourse.
On the buyer side, the LTA is the mechanism that separates the insulated from the exposed. Those without LTAs — which includes most enterprise companies and virtually all SMBs — compete on the volatile spot market. Google has stationed procurement executives in South Korea. Microsoft executives staged a walkout at SK Hynix negotiations. This is not gentle commerce.
The cost impact on PC OEMs is severe: memory now makes up approximately 35% of Dell and HP PC build materials, up from 15–18%. Apple is paying a 230% premium on LPDDR5X for the iPhone 17, even with secured LTAs.
For the full analysis of market power dynamics and LTA mechanics, see Samsung, SK Hynix, Micron, and the hyperscalers who locked up all the memory.
The PC and smartphone markets are experiencing the sharpest downstream impact. IDC projects a 10–11% decline in PC shipments and an 8–9% decline in smartphone shipments in 2026 as memory costs make affordable devices financially unviable. Gaming GPU supply has been depressed, with Nvidia unable to produce discrete gaming cards at previous volumes. Enterprise hardware prices are rising, and cloud infrastructure costs are beginning to increase.
The PC market is being hit from two directions: memory costs are rising while demand is also under pressure. Gartner projects entry-level laptops under $500 will become financially unviable within two years if current cost trajectories persist. PC vendors — Lenovo, Dell, HP, Acer, ASUS — have warned clients of 15–20% price hikes and contract resets.
Smartphone OEMs face a similar squeeze. Apple is partially insulated by LTAs but is still paying premium prices. Akihabara retailers in Tokyo have already implemented RAM purchase limits to prevent hoarding. The irony in the PC space is that AI PCs require a minimum of 16GB RAM for Microsoft Copilot+, but adding more RAM has become prohibitively expensive — the AI PC narrative is undermined by the same AI demand that created it.
For the full breakdown by market segment, see How the AI memory crunch is wrecking PC, smartphone, and gaming GPU markets.
Geopolitics is actively shaping who can produce memory, who can sell it to whom, and where new capacity will be built. US export controls restrict advanced semiconductor manufacturing equipment from reaching China. The US has simultaneously threatened 100% tariffs on Samsung and SK Hynix chips — punishing the South Korean suppliers that Western buyers currently depend on while trying to build domestic alternatives.
China’s domestic memory response centres on CXMT (ChangXin Memory Technologies), which is targeting HBM3 mass production by end 2026 and has filed for a $4.2 billion IPO. But US export controls constrain CXMT’s access to ASML lithography equipment and other tooling, with localisation at only around 20%. SemiAnalysis projected CXMT would account for nearly 15% of global DRAM production by 2026 — meaningful, but not enough to resolve the shortage.
On the US side, the CHIPS Act is funding domestic capacity — primarily Micron’s $100 billion-plus fab complex in New York state. But fab construction and qualification timelines are 2–4 years. US Commerce Secretary Howard Lutnick made the geopolitical stakes explicit at Micron’s groundbreaking: “Everyone who wants to build memory has two choices: they can pay a 100% tariff, or they can build in America.”
For your planning purposes, the geopolitical dimension increases the risk that supply chain disruption is not simply a market cycle but a structural feature of the memory market for the rest of the decade. For the full analysis, see China, DRAM, and export controls: the geopolitical factors shaping memory supply.
The consensus among analysts is that meaningful supply relief will not arrive before 2027, and the shortage may well persist into 2028. Building new fab capacity or converting existing capacity takes 2–4 years. All three major manufacturers have announced capacity expansion programmes, but these investments will not produce marketable memory for 12–24 months from their respective commitment dates.
Intel CEO Lip-Bu Tan was direct about it at the Cisco AI Summit in February 2026: “There’s no relief until 2028.”
The specific investments in train — Samsung’s Pyeongtaek P5 fab expansion, SK Hynix’s $13 billion new HBM assembly plant, and Micron’s New York state fab complex — are all real commitments, but none will translate to significant additional supply before 2027.
There is meaningful uncertainty in these forecasts. Demand destruction is an under-addressed scenario. If AI investment slows due to business model failures or algorithmic efficiency gains — smaller models requiring less memory — demand could moderate before new supply comes online, potentially bringing forward normalisation. This is not the consensus scenario, but it is worth tracking.
You should plan infrastructure budgets on the assumption that memory-intensive costs will remain elevated through at least 2027. For the full fab timeline analysis, see When will the memory shortage end? Fab timelines and what they tell us.
The highest-leverage responses are: optimise AI inference workloads using quantisation and architectural efficiency techniques to reduce memory footprint per inference call; evaluate infrastructure procurement decisions now — whether to commit to reserved cloud instances, move to bare metal, or maintain on-demand flexibility; and model infrastructure costs at current elevated prices to stress-test product economics rather than assuming prices will return to 2023 levels.
Quantisation — reducing model weight precision from 16-bit to 4-bit or 8-bit — is the most accessible immediate mitigation. Halving precision often allows GPU Tensor Cores to roughly double throughput, improving speed, memory usage, and energy efficiency simultaneously.
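A minimal sketch of what quantisation does, assuming simple symmetric per-tensor int8 with NumPy. Production systems use calibrated libraries (TensorRT, bitsandbytes, and the like), not hand-rolled code; this only illustrates why precision reduction halves or quarters the memory footprint.

```python
# Symmetric per-tensor int8 quantisation, a minimal NumPy sketch.
import numpy as np

def quantize_int8(w):
    """Map floating-point weights to int8 plus a per-tensor scale factor."""
    scale = np.abs(w).max() / 127.0          # largest magnitude maps to 127
    q = np.round(w / scale).astype(np.int8)  # 1 byte/weight vs 2 for fp16
    return q, scale

def dequantize(q, scale):
    """Recover approximate floating-point weights."""
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)
q, scale = quantize_int8(w)
print(q.nbytes / w.astype(np.float16).nbytes)  # 0.5: half the fp16 footprint
print(f"max error: {np.abs(dequantize(q, scale) - w).max():.4f}")
```

Less memory per weight also means less bandwidth per inference call, which is why quantisation attacks the memory wall directly rather than just shrinking storage.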
Infrastructure procurement decisions involve real trade-offs. Reserved cloud instances offer price certainty but require commitment. Bare metal offers the best per-unit economics but requires operational overhead. On-demand cloud offers flexibility but maximum price exposure. The right choice depends on your workload predictability and company size.
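The trade-off in the paragraph above reduces to a simple break-even calculation. The hourly rate, discount, and utilisation figures below are hypothetical placeholders, not quotes from any provider.

```python
# Rough reserved-vs-on-demand comparison. All prices and the discount are
# hypothetical; substitute your own vendor quotes.

HOURS_PER_YEAR = 8760

def annual_costs(od_rate, utilisation, reserved_discount, hours=HOURS_PER_YEAR):
    """Annual cost under on-demand vs a fully committed reservation.

    On-demand scales with actual utilisation; a reservation is paid for
    every hour whether used or not, at a discounted rate.
    """
    on_demand = od_rate * hours * utilisation
    reserved = od_rate * (1 - reserved_discount) * hours
    return on_demand, reserved

# Hypothetical $2/hr instance, 70% utilisation, 40% reserved discount:
od, res = annual_costs(od_rate=2.0, utilisation=0.70, reserved_discount=0.40)
print(f"on-demand ${od:,.0f} vs reserved ${res:,.0f}")
# Break-even: the reservation wins whenever utilisation > (1 - discount).
```

The break-even rule is the useful output: with a 40% discount, any workload you expect to run more than 60% of the time favours the commitment, which is exactly why workload predictability drives the decision.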
FinOps discipline is increasingly necessary. As hyperscalers begin passing memory cost increases through to cloud customers, infrastructure costs that were predictable are becoming volatile. Proactive tagging, usage monitoring, and commitment evaluation are baseline requirements.
If you don’t have the scale to sign long-term agreements directly with memory manufacturers, focus on the levers you do control: reducing per-inference memory consumption, locking in cloud commitment pricing where available, and timing hardware refreshes to avoid peak spot market pricing. For the full enterprise response playbook, see How to run AI workloads and manage infrastructure costs during the memory shortage.
HBM is a specialised DRAM architecture that stacks up to 12 memory dies vertically using through-silicon vias (TSVs), delivering dramatically higher data transfer bandwidth than conventional DDR or LPDDR modules. One HBM stack uses approximately three times the wafer capacity of equivalent standard DRAM, which is the core reason the shift to HBM production is compressing conventional memory supply. See How AI killed the memory supply chain for the full technical explanation.
The 2020–2023 shortage was driven by pandemic supply chain disruptions — external shocks to production and logistics. The 2024–2026 shortage is driven by deliberate manufacturing strategy: fabs are actively choosing to reallocate wafer capacity to higher-margin HBM because AI demand is so strong. This makes the resolution timeline different and slower — the market is waiting for capital investment to produce new capacity, not for disruptions to clear. See DRAM prices in 2026 for the comparative analysis.
An LTA is a multi-year supply contract between a memory buyer and a manufacturer that locks in committed volumes. Hyperscalers signed these agreements before the shortage tightened, securing preferential supply access. Buyers without LTAs — which includes most enterprise companies and virtually all SMBs — compete on the volatile spot market. See Samsung, SK Hynix, Micron, and the hyperscalers for a full explanation of how LTAs work.
Yes, with a lag. Hyperscalers have so far largely absorbed increased memory costs rather than passing them immediately to cloud customers, but this is not sustainable indefinitely. As memory contract terms roll over and new agreements are signed at elevated prices, cloud pricing increases are expected to follow. See How to run AI workloads and manage infrastructure costs for how to prepare.
CXMT represents China’s strategic response but its near-term capacity is constrained by US export controls on advanced semiconductor manufacturing equipment. CXMT is targeting HBM3 mass production by end 2026 and could meaningfully supplement conventional DRAM supply, but it cannot substitute for SK Hynix, Samsung, or Micron at current capacity levels. See China, DRAM, and export controls for the full analysis.
It depends on your timeline and workload certainty. If you have predictable, near-term memory-intensive requirements, buying now at current elevated prices may be preferable to buying at potentially higher spot prices in 12–18 months. If requirements are uncertain or 2–3 years out, waiting for supply normalisation may yield better unit economics. See When will the memory shortage end? for supply timeline analysis.
Quantisation reduces the numerical precision of AI model weights — for example, from 16-bit floating point to 8-bit or 4-bit integers. Lower-precision weights require less memory to store and less bandwidth to transfer during inference, reducing the GPU memory footprint per inference call. Halving precision typically allows Tensor Cores to roughly double throughput. Dropbox uses quantisation in production for its Dash product. See How to run AI workloads and manage infrastructure costs for the full optimisation playbook.
When Will the Memory Shortage End and What the Fab Timelines Actually Tell Us

The memory shortage is structural, and the question sitting on top of every infrastructure budget is the same: when does it end? The people who should know can’t agree. Intel’s CEO says no relief until 2028. Counterpoint Research reckons Q4 2027. IDC says it’s not a cyclical blip at all — it’s a permanent structural reset.
Why can’t they agree? Because they’re making different bets on two things: how fast AI demand will keep growing and how quickly HBM yields will improve. This article maps every announced fab timeline against demand, explains why prices will come down a lot more slowly than they went up, and gives you a planning horizon you can actually work with. For the full picture of how AI created this mess, see the full AI memory shortage story.
Intel CEO Lip-Bu Tan told the Cisco AI Summit in February 2026 there is “no relief until 2028” — the most pessimistic mainstream forecast from a named executive. Counterpoint Research analyst Tarun Pathak puts Q4 2027 as the earliest point where supply and demand curves could cross.
Micron CEO Sanjay Mehrotra projects the HBM market hitting $100 billion by 2028. Micron is sold out for 2026, with demand having “far outpaced our ability to supply that memory.” TrendForce projects the memory market surging to $842.7 billion in 2027. And IDC’s Nabila Popal doesn’t mince words: “This is not just a temporary situation. This is going to result in a structural reset of the entire industry.”
So here’s where the disagreement lands. It’s all about two variables: how fast AI demand will grow and how quickly HBM yields will improve. Get both wrong in the same direction and you’re off by years.
The spread between Q4 2027 and 2028+ is the window you need to plan around. For the current price data, see our analysis of why DRAM prices have doubled.
A new fab takes at least 18 months to build — and that’s just the physical construction. Getting yields up to meaningful volume adds more months on top. Each facility costs $15 billion or more. The physics don’t compress, no matter how much capital you throw at them.
The timing problem goes back to the 2022-2023 memory bust. Prices tanked, Samsung cut production by 50%, and once things recovered the industry was gun-shy about expanding. Then AI demand exploded and caught everyone short.
HBM production makes it worse. Each HBM chip stacks up to 12 thinned-down DRAM dies. When Micron makes one bit of HBM, it has to forgo making three bits of conventional memory — the “three-to-one basis.” And even when DRAM wafers are available, packaging capacity at TSMC (CoWoS) and SK Hynix (MR-MUF) puts a hard cap on how many HBM units actually get assembled.
Shawn DuBravac at the Global Electronics Association reckons yield improvement is the faster path: better stacking efficiency and tighter coordination between memory suppliers and AI chip designers will deliver gains before new fabs do. For more on the packaging bottleneck, see how AI broke the memory supply chain.
The 2027 wave brings three facilities: Micron Singapore (HBM fab), Micron Taiwan (retooled from a PSMC acquisition, H2 2027), and SK Hynix’s $13 billion HBM packaging facility at Cheongju — the world’s largest HBM assembly plant.
The 2028 wave adds SK Hynix West Lafayette, Indiana (CHIPS Act-funded) and Samsung Pyeongtaek with a new production line.
Then there’s the Micron Clay, New York megafab — the single largest planned capacity addition. It broke ground in January 2026, but first production will not arrive until 2030. The facility that would have made the biggest near-term difference is now a 2030 story.
Here’s the catch: all three 2027 facilities are purpose-built for HBM. They don’t directly relieve conventional DRAM supply. And new capacity has to outpace growing demand, not just match it. Nvidia’s B300 uses eight HBM chips, each stacking 12 DRAM dies. HBM4 goes to 16. IDC expects just 16% year-over-year DRAM supply growth in 2026, well below AI-driven demand growth. These new fabs may merely keep pace rather than creating any surplus.
Most shortage coverage skips this bit, and it’s the part that matters most for your budget.
Kim at Mkecon Insights puts it plainly: “In general, economists find that prices come down much more slowly and reluctantly than they go up. DRAM today is unlikely to be an exception.”
It’s called price asymmetry. Three things drive it.
First, contract reset cycles. OEMs like Dell and HP purchase memory in bulk about a year in advance. When spot conditions ease, your contracted pricing stays elevated until the next reset window. You’re locked in.
Second, vendor inventory protection. Pricing floors and volume commitments slow the pass-through of falling costs. Procurement leverage hinges on strategic alignment, not volume. Hyperscalers lock in supply. Everyone else fights over what’s left.
Third, vendor pricing discipline. Manufacturers who sank $15 billion into each fab need to recoup that capital. Analysts assess these price increases as durable rather than temporary. They’re not going to race to the bottom.
For your budgeting: even when supply relief arrives, prices won’t snap back. Model a gradual 12-18 month decline from the inflection point, not a step function.
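One way to encode that guidance in a budget model is to glide the line item down exponentially from the inflection point instead of stepping it back to baseline. The peak multiple, structural floor, and half-life below are illustrative assumptions, not forecasts.

```python
# Toy budgeting model: memory prices decline gradually after the inflection
# point rather than snapping back. All parameters are illustrative.

def price_path(peak, floor, months, half_life=6.0):
    """Exponential glide from a peak multiple toward a structural floor."""
    return [floor + (peak - floor) * 0.5 ** (m / half_life)
            for m in range(months)]

# E.g. a line item that peaked at 2.7x its old baseline, settling toward
# a hypothetical 1.3x structural premium over an 18-month decline.
path = price_path(peak=2.7, floor=1.3, months=18)
for m in (0, 6, 12, 17):
    print(f"month {m:2d}: {path[m]:.2f}x baseline")
```

Note the floor is above 1.0x: the model bakes in the structural premium from permanent wafer reallocation rather than assuming a return to pre-shortage pricing.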
Here’s the concrete planning guidance.
Upside scenario (Q4 2027): The earliest possible inflection, if HBM yield ramps exceed expectations and AI demand moderates. This is the Counterpoint Research view. Don’t build your base budget on this.
Base case (gradual improvement through 2028): New fabs come online, yield improves, but price asymmetry means prices decline slowly. Meaningful relief arrives mid-2028. Use this for your 2026-2028 infrastructure roadmap.
Downside scenario (2029-2030): If HBM4 demand accelerates before capacity catches up, or if the Micron Clay NY delay cascades, full normalisation doesn’t arrive until 2030.
For hardware refresh: if servers need replacing in 2026, waiting for price relief is not viable. The earliest meaningful supply improvement is 18+ months away. Budget at current elevated pricing and move on.
For cloud commitments: multi-year contracts signed in 2026 lock in shortage pricing. Consider shorter terms with renegotiation windows aligned to the Q4 2027-2028 relief window. Give yourself room to move.
Even after normalisation, memory pricing may settle at a higher equilibrium than pre-2025 levels. The permanent wafer reallocation toward HBM means conventional DRAM carries a structural premium until manufacturers reverse course. For more on timing these decisions, see managing infrastructure costs during the shortage.
Yes, there are wildcards on both the demand side and the supply side.
The fastest path to relief is a demand contraction. If enterprise AI deployments don’t prove ROI — a risk Deloitte’s 2026 semiconductor outlook flags — spending could contract before new capacity arrives. Deloitte notes the industry “has placed all its eggs in the AI basket”. If those eggs don’t hatch, the surplus comes early.
On the supply side, HBM yield improvements could accelerate relief faster than new construction. SK Hynix’s MR-MUF process and hybrid bonding could enable denser stacks without waiting for new fabs.
The downside scenarios deserve equal weight. HBM4 goes to 16 stacked dies versus 12 today, consuming more wafer area before new capacity arrives. McKinsey predicts $7 trillion in data centre spending by 2030. If that accelerates, demand could outrun even the most aggressive fab plans.
Unlike the COVID chip shortage, which was cyclical, this one is structural. Resolution requires active vendor decisions to reverse wafer allocation, not just the passage of time.
Your planning assumption: gradual improvement through 2028, with demand-side wildcards as the main variables to watch. For the full story, see what the memory supply crisis means for tech companies and what comes next.
The credible range is Q4 2027 (Counterpoint Research) to 2028+ (Intel CEO’s forecast). Full price normalisation may not come until 2030, given the Micron Clay NY delay and price asymmetry. Plan for gradual improvement through 2028 as your base case.
The COVID shortage was cyclical — a temporary demand spike that normalised, and supply caught up. This one is structural — IDC frames it as a permanent reallocation of wafer capacity toward HBM. It doesn’t self-correct when demand eases. Manufacturers have to deliberately reverse it, and that requires them to be confident conventional DRAM demand will recover.
If servers need replacing in 2026, waiting isn’t an option — the earliest meaningful supply improvement is 18+ months out. Budget at current elevated pricing. For discretionary upgrades, consider deferring to late 2027 or early 2028.
The AI-driven memory shortage is forecast to persist through at least 2027, with most analysts projecting gradual improvement in 2028. It specifically affects DRAM and HBM supply; broader chip shortages vary by segment.
Price asymmetry means memory prices fall more slowly than they rise. Contract reset cycles and vendor pricing strategy slow the decline. Pre-2025 pricing levels may not return because of the structural premium from permanent wafer reallocation toward HBM.
A new fab takes 18+ months to build and costs $15 billion or more. Getting yields up to meaningful volume adds further months. The physics of cleanroom construction, equipment installation, and yield optimisation simply can’t be compressed with more money.
HBM (high-bandwidth memory) stacks multiple DRAM dies vertically to provide the bandwidth AI GPUs need. Each HBM unit consumes three times the wafer area of conventional DRAM. When manufacturers allocate cleanroom capacity to HBM, they directly reduce conventional DRAM output. It’s a zero-sum game.
The Micron Clay, New York megafab was the largest planned capacity addition, but it’s been delayed to 2030. SK Hynix Cheongju (2027) and Micron Singapore (2027) are the nearest major additions, though both are HBM-focused.
The CHIPS Act funds US-based fab construction, including Micron’s Clay NY facility and SK Hynix’s West Lafayette plant. But these are targeting 2028-2030 production starts, so CHIPS Act investment doesn’t speed up near-term relief.
Consider shorter commitment terms with renegotiation windows aligned to Q4 2027-2028. Multi-year contracts signed in 2026 lock in shortage pricing. Shorter terms give you the flexibility to renegotiate when supply improves.
IDC uses “structural reset” to describe the permanent reallocation of wafer capacity from conventional DRAM to HBM. Unlike a cyclical shortage, this shift doesn’t self-correct when demand eases — manufacturers have to deliberately reverse it, and that requires confident forecasts of conventional DRAM demand recovery.
Yes. If enterprise AI deployments fail to prove ROI (a risk Deloitte flags), AI infrastructure spending could contract before new capacity arrives. That would reduce HBM demand, free up wafer capacity for conventional DRAM, and potentially create oversupply after 2027.