On 14 January 2026, the White House issued Proclamation 11002 — a 25% import duty on AI chips. That is not a news headline, it is a line item in your next hardware budget.
If your company procures AI GPU hardware, this changes what you pay. Significantly. And if you are not one of the handful of hyperscalers that have secured effective tariff immunity, you are paying the full 25% on every chip you import. That is Amazon, Google, and Microsoft on one side. Everyone else — including every SMB tech company — on the other.
On top of that: the Remote Access Security Act (RASA) is changing the compliance picture for cloud GPU rental, HBM3e memory supply is sold out through end of 2026, and the AI OVERWATCH Act introduces a potential Congressional veto over export licences.
Here is what you need to know. We are going to give you a cost model, a two-tier market explanation, a cloud-versus-owned decision framework, a supply timeline, and a procurement hedging checklist. Read this and you will be able to brief your CFO on the dollar impact and your procurement team on what to do next.
What is the 25% AI chip tariff and which companies are actually exposed?
Section 232 of the Trade Expansion Act of 1962 gives the President authority to restrict imports on national-security grounds — no Congressional approval required. The January 2026 proclamation used that authority to apply a 25% ad valorem duty on semiconductors meeting specific technical parameters. The White House fact sheet names them directly: the NVIDIA H200 and AMD MI325X.
The tariff hits at the US border on first importation. It is not a recurring sales tax — you pay it once, when the hardware crosses the border. But you pay it upfront, at full value, before a single GPU reaches your rack.
The Department of Commerce’s national-security investigation found that the US manufactures only around 10% of the chips it needs. Most AI GPUs — NVIDIA H200, NVIDIA Blackwell, AMD MI325X — are made by TSMC in Taiwan. That Taiwanese origin means full tariff exposure for any company importing them into the US.
If you are buying GPU hardware for on-premise deployment, you are directly exposed. Cloud GPU customers do not pay the import duty itself, but they face cost pass-through from providers and RASA compliance exposure for cross-border operations — more on that below.
One thing worth knowing: the tariff applies per chip. An 8-GPU server carries 8× the tariff load. That matters when you are modelling cluster costs.
For a full breakdown of how the Proclamation works and the Section 232 mechanics behind it, read how the 25% tariff works and what it means for GPU procurement.
Who gets tariff relief — and who pays the full 25%?
The tariff is not applied equally. A mechanism built into the exemption framework creates a two-tier market — hyperscalers on one side, everyone else on the other.
Under the US-Taiwan semiconductor deal, companies that commit capital to TSMC Arizona’s construction can import AI chips at a multiple of their invested capacity without paying the 25% tariff: 2.5× during construction, dropping to 1.5× once operational. Amazon, Google, and Microsoft committed at the scale needed to make that multiplier cover their GPU procurement volumes. Their effective tariff rate is near zero.
TSMC’s CFO confirmed the company has already committed $165 billion in US investment, with Fab 21 Phase 1 having entered volume production in Q4 2024. The companies that anchored that investment get access to the import quota.
The exemption is designed to reward domestic semiconductor manufacturing investment. The practical outcome: hyperscalers absorb GPU tariff costs as a rounding error; SMB companies face a 25% structural cost premium on every procurement cycle. The CFR notes the arrangement is politically contingent, but it is the operating reality for now.
The investment threshold required to access the multiplier is in the billions. There is no SMB-accessible equivalent. Assume the full 25% applies to you.
For the full account of the deal structure and who benefits from the hyperscaler tariff exemptions, see the US-Taiwan $500 billion semiconductor deal — who actually benefits and who bears the cost.
How much does the 25% tariff actually add to an AI infrastructure budget?
Let us put numbers to it. The reference GPU is the NVIDIA H200 SXM. Chinese technology companies have ordered more than two million H200 processors at roughly $27,000 per unit, which makes that figure a solid market reference price.
At $27,000 per unit, the 25% tariff adds $6,750 per chip. Scale that out:
- 4-unit cluster: $108,000 base + $27,000 tariff = $135,000 total
- 8-unit cluster: $216,000 base + $54,000 tariff = $270,000 total
- 10-unit cluster: $270,000 base + $67,500 tariff = $337,500 total
- 16-unit cluster: $432,000 base + $108,000 tariff = $540,000 total
- 32-unit cluster: $864,000 base + $216,000 tariff = $1,080,000 total
The 4-unit cluster is a useful anchor: the tariff alone costs you the equivalent of a complete extra GPU unit, paid to customs at order time. For every four GPUs you import, you pay customs the price of a fifth.
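The cluster arithmetic above is simple to script for your own configurations. A minimal sketch, using this article's reference figures ($27,000 per unit, 25% ad valorem) rather than any vendor's actual quote:

```python
# Sketch of the tariff-inclusive cluster cost model described above.
# Reference figures from this article: $27,000 per H200 unit, 25% ad
# valorem duty applied per chip at first importation. Swap in your
# own quoted unit price and cluster size.

UNIT_PRICE = 27_000   # USD, market reference price per H200
TARIFF_RATE = 0.25    # 25% ad valorem duty

def cluster_cost(gpus: int, unit_price: float = UNIT_PRICE,
                 tariff_rate: float = TARIFF_RATE) -> dict:
    """Return base hardware cost, tariff, and total landed cost."""
    base = gpus * unit_price
    tariff = base * tariff_rate
    return {"base": base, "tariff": tariff, "total": base + tariff}

for n in (4, 8, 10, 16, 32):
    c = cluster_cost(n)
    print(f"{n:>2} GPUs: base ${c['base']:,.0f} "
          f"+ tariff ${c['tariff']:,.0f} = ${c['total']:,.0f}")
```

The tariff line is the figure to flag to your CFO: under full-upfront-payment terms it must be funded in cash at order placement, not at delivery.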
Here is the part that catches teams off guard. NVIDIA requires full advance payment with no option to cancel or change configurations afterward. The entire tariff cost — $54,000 on an 8-GPU cluster — must be funded at order placement. For budget-constrained teams, this is a cash-flow planning issue as much as a cost issue.
AMD MI325X is subject to the same 25% tariff treatment. It offers supply-chain diversification and potentially more flexible payment terms via distributors, but it is not a tariff escape route. And Deloitte’s 2026 semiconductor industry outlook projects memory shortages will drive 50% price spikes by mid-year 2026 — the $27,000 reference price may be the floor, not the ceiling.
One supply factor that is easy to underestimate: the 2M-unit H200 backlog from Chinese technology companies. Understanding why China’s H200 demand affects Western GPU availability — and why those buyers cannot simply switch to domestic alternatives — is essential context for reading lead times and spot market signals in 2026.
What does the Remote Access Security Act mean for companies renting cloud GPUs?
The Remote Access Security Act (RASA) passed the US House by a 369-22 vote on 12 January 2026. It is in the Senate Banking, Housing, and Urban Affairs Committee. Not yet law, but Latham & Watkins describes its passage as having a fair chance.
What RASA does is close the cloud loophole. Before RASA, a company could rent access to export-controlled AI chips through an offshore data centre without triggering export licence requirements — because no hardware crossed a border. RASA treats remote access to restricted AI chips as equivalent to a physical shipment.
The enforcement precedents that drove RASA are concrete. INF Tech (Shanghai) accessed 2,300 NVIDIA GPUs — 32 GB200 servers worth an estimated $100 million — by renting servers from an Indonesian telecom company. Tencent secured $1.2 billion in contracts for 15,000 Blackwell processors via Japanese provider Datasection. RASA is a direct response.
If your company is US-domiciled and using AWS, Azure, or GCP for domestic operations, you have no direct new compliance obligations under RASA. The compliance burden sits with the cloud provider.
The risks are indirect. Providers may restrict access to certain GPU types or regions. They may pass compliance costs into pricing. And if your company has operations in Indonesia, Singapore, or other export-control-adjacent jurisdictions, cloud GPU access from those endpoints may face restriction or additional scrutiny.
Worth noting: BIS has already signalled that providing access to advanced AI chips may trigger existing EAR Part 744 catchall controls — even before RASA becomes law. The intent is being enforced ahead of the legislation.
For a full legislative analysis, including the AI Overwatch Act oversight risk and the trusted US person framework, see the AI OVERWATCH Act and Remote Access Security Act — how new laws are reshaping cloud GPU access.
What does the 2026–2027 memory shortage mean for procurement timing?
GPU availability in 2026–2027 is not primarily constrained by chip fabrication capacity. It is constrained by High Bandwidth Memory (HBM) — the stacked DRAM integrated onto GPU dies that makes large AI models runnable in real time. If HBM is not available, the GPU cannot be assembled. And right now, HBM is not available.
“We have already sold out our entire 2026 HBM supply” — that is SK Hynix CFO Kim Jae-joon. SK Hynix holds roughly 50% of the HBM market. No unallocated HBM3e capacity is available to new buyers until at least Q1 2027. Micron's CEO said the same thing: “Our HBM capacity for calendar 2025 and 2026 is fully booked.”
HBM4, required for Blackwell-class GPUs, is delayed. All three major HBM suppliers were reportedly forced to redesign their HBM4 products to meet NVIDIA’s upward-revised bandwidth specifications. NVIDIA disputes the characterisation, stating its HBM4 partners “remain on track for production shipments in the second half of this year.” Even NVIDIA’s own timeline puts HBM4 volume production in H2 2026 — and that is the optimistic scenario.
CoWoS — the advanced packaging process that integrates HBM onto the GPU die — is a second binding constraint. TSMC’s CEO confirmed CoWoS capacity is “very tight and remains sold out through 2025 and into 2026.” Even if HBM became available tomorrow, packaging capacity is also committed.
Procurement timing reality:
- H200 ordered today: 6–12 month lead time
- Blackwell ordered today: 12–18+ months, no reliable delivery window before mid-2027
- Wait for HBM4 / Blackwell: realistic volume availability not before H2 2027 at the earliest
If you are holding off on buying because you want better hardware later, you are going to wait longer than you think. Samsung is raising HBM prices by high-teens to low-twenties percent in 2026 contracts — supply constraint is driving price increases on top of availability constraints.
For the full supply chain analysis, including memory bottlenecks and the buy-now-or-wait decision, see HBM4 delays and GDDR7 shortages — the memory bottlenecks squeezing GPU supply in 2026.
Should you buy GPU hardware now, or wait out the tariff and supply uncertainty?
The buy-versus-wait decision comes down to three things: how urgent is your AI workload need; how long is the realistic wait; and what is the risk that waiting makes things worse?
Buy now if: there is a live AI workload with measurable revenue or cost impact that cloud GPU cannot serve adequately; your organisation can fund the full upfront payment including the 25% tariff; and lead times of 6–12 months are acceptable.
Wait if: your primary workload can be served by cloud GPU for the next 12–18 months without RASA disruption risk; budget constraints make the upfront tariff cost difficult to absorb now; or you are inference-only and AMD MI325X availability is worth evaluating first.
Waiting is not a neutral position. Section 232 authority allows the President to adjust tariff rates without Congressional approval — tariffs could increase before domestic production provides any relief. Supply-chain experts see early signs of stabilisation around 2027 but caution that baseline pricing is unlikely to return to pre-2024 levels. And cloud GPU pricing is rising as hyperscalers pass through infrastructure cost increases.
TSMC Arizona Fab 21 Phase 2 volume production is not expected before H2 2027. Waiting for domestic production to drive costs down is not a 2026 procurement strategy.
About 31% of enterprise decision-makers are evaluating Google’s TPUs and 26% are evaluating AWS’s Trainium as alternatives. Not drop-in replacements for NVIDIA-based training workloads, but for inference-only use cases they may extend your options.
How do you model the owned-GPU vs cloud-GPU decision under the current regulatory environment?
The cloud-versus-own calculus has shifted since 2025. RASA has added a compliance variable to the cloud side that did not exist before. So the decision is not just about cost anymore.
You need four inputs: total cost of ownership for owned hardware including the 25% tariff; cloud GPU pricing for equivalent compute; RASA compliance exposure based on your operational geography; and workload flexibility requirements.
On-premises hosting becomes cheaper than cloud when GPU utilisation stays above 60–70% throughout the hardware’s lifespan. That break-even point is your starting point for the economic analysis.
For a $27,000 H200 with 25% tariff, the landed cost is $33,750. Amortised over four years at 60% average utilisation, the effective cost per GPU-hour increases by around 25% versus a pre-tariff model. Cloud H200 instances run at approximately $2.50/hr per GPU on one-month term contracts — 2026 pricing with supply constraints may be higher. Run the comparison against your actual utilisation profile.
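That comparison can be run as a quick amortisation model. A sketch under the assumptions stated above (4-year life, $27,000 unit price, 25% tariff, roughly $2.50/hr cloud rate); hosting, power, and staffing costs are deliberately excluded and should be added for a full TCO view:

```python
# Sketch: owned (tariff-inclusive) cost per GPU-hour vs a cloud rate.
# Assumed inputs from this article: $27,000 unit price, 25% tariff,
# 4-year useful life, ~$2.50/hr cloud pricing. Hosting, power, and
# staffing costs are excluded; add them for a full TCO comparison.

HOURS_PER_YEAR = 8_760

def owned_cost_per_gpu_hour(unit_price: float, tariff_rate: float,
                            years: float, utilisation: float) -> float:
    """Amortise the tariff-inclusive landed cost over GPU-hours actually used."""
    landed = unit_price * (1 + tariff_rate)
    used_hours = years * HOURS_PER_YEAR * utilisation
    return landed / used_hours

cloud_rate = 2.50  # USD/hr, approximate one-month-term H200 pricing
for util in (0.3, 0.6, 0.9):
    owned = owned_cost_per_gpu_hour(27_000, 0.25, 4, util)
    cheaper = "owned" if owned < cloud_rate else "cloud"
    print(f"utilisation {util:.0%}: owned ${owned:.2f}/hr vs "
          f"cloud ${cloud_rate:.2f}/hr -> {cheaper}")
```

Note how sensitive the answer is to utilisation: at low utilisation the amortised capital cost per useful hour climbs quickly, which is why the 60–70% threshold is the pivot of the whole decision.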
Cloud GPU is the lower-risk choice if your company is wholly US-domiciled, uses only AWS, Azure, or GCP, has variable or unpredictable compute demand, and has compute needs below the threshold that justifies owned hardware amortisation.
Owned GPU is the better choice if you have predictable high-utilisation workloads above the 60–70% threshold, can absorb the full upfront tariff cost, and operate in geographies where RASA compliance risk is low.
For companies with operations in Indonesia, Singapore, or Australia: cloud GPU access via non-hyperscaler providers may face RASA restriction. Hyperscaler cloud (AWS Asia Pacific regions) is structurally lower RASA-compliance risk than regional providers, but may carry pricing premiums. Owned hardware avoids cloud access risk but may create separate import duty considerations under local tariff regimes.
The AI OVERWATCH Act adds a Blackwell-specific variable: if Congress passes a two-year ban on Blackwell-class GPU exports, cloud providers that have not taken delivery of Blackwell hardware may be unable to offer it — shifting the calculus toward buying current-generation hardware sooner.
What procurement hedges are available to SMB tech companies in 2026?
Given tariff, supply, and regulatory uncertainty, a single-channel procurement strategy carries concentrated risk. Here is a hedging framework that distributes that exposure.
Hedge 1 — Split owned/cloud allocation
Acquire the minimum owned GPU capacity needed for predictable baseline workloads. Use reserved cloud GPU capacity for burst demand and growth. Practical step: define your baseline utilisation floor before placing any hardware order.

Hedge 2 — Multi-vendor diversification
Procure both NVIDIA H200 and AMD MI325X where your workloads permit. AMD carries the same 25% tariff burden, but AMD’s supply chain is less hyperscaler-dominated, which may mean better spot availability for SMB buyers. Diversification also gives you negotiating leverage with NVIDIA. You will need a workload compatibility assessment first — MI325X is not a drop-in replacement for all training workloads.

Hedge 3 — Reserved cloud capacity commitments
Negotiate 12–24 month reserved GPU instance contracts with AWS, Azure, or GCP now, before supply constraints tighten further or RASA compliance costs get passed through. Hyperscaler capex for 2026 is forecast at $527 billion — AWS, Azure, and GCP will maintain GPU supply even under constraint. Reserve pricing locks in a cost ceiling. Practical step: contact AWS enterprise sales to enquire about one-year reserved GPU instance pricing.

Hedge 4 — Vendor contract tariff-risk clauses
Any hardware purchase contract should specify who bears the cost of tariff changes between order and delivery. Given 6–12 month lead times, tariff rates could change before your hardware arrives. Three provisions serve this function: fixed total-landed-cost pricing (seller bears tariff change risk); a material-adverse-change clause permitting cancellation at no penalty if tariffs increase by more than a defined threshold; and an export-licence contingency clause permitting cancellation and full refund if BIS or Congressional action makes delivery legally impermissible.

Hedge 5 — Staggered purchasing timeline
Rather than one large procurement, stage purchases across two or three tranches to average out tariff exposure, supply timing, and technology-cycle risk. NVIDIA’s full upfront payment requirement is a negotiation point, not a fixed constraint — in limited cases, customers may be able to substitute commercial insurance or asset collateral for cash.
These hedges are only achievable if your vendor contracts allow for them — which brings us to what those contracts should actually contain.
How do you renegotiate vendor contracts in a full-upfront-payment environment?
NVIDIA’s standard GPU procurement terms require 100% payment at order, with no cancellation provisions. That means if tariff rates increase between order and delivery, the buyer absorbs the difference. If export licence conditions change, the order may be undeliverable and payment may not be refundable. The AI OVERWATCH Act would terminate all existing export licences for covered chip categories if passed — which makes export-licence contingency clauses a practical contract issue, not a theoretical one.
Three provisions distribute that risk:
Fixed total-landed-cost pricing: The seller bears the tariff change risk, not the buyer. This shifts exposure for rate changes between order placement and delivery.
Material-adverse-change clause: Permits cancellation without penalty if tariff rates increase by more than a defined threshold. A threshold of 5 percentage points is a concrete and reasonable anchor to propose.
Export-licence contingency clause: Permits order cancellation and full refund if BIS or Congressional action makes delivery legally impermissible. BIS has already moved H200 and MI325X exports to China to case-by-case review — export licences are not guaranteed approvals. And a class-action complaint filed in March 2026 alleged an IT company misled investors by failing to disclose that server sales to Chinese companies violated US export controls — export compliance failures now carry securities fraud exposure.
AMD distributors are generally more flexible on payment terms than NVIDIA direct. For workloads that are AMD-compatible, that is a contractual advantage that complements the supply diversification argument.
White & Case notes the proclamation directs USTR to negotiate on tariff rates and warns that “depending on the outcome, President Trump may consider imposing higher tariffs.” Multi-month procurement contracts without tariff-change provisions carry rate-change exposure worth addressing at the contracting stage.
What does a 2026–2027 AI infrastructure planning framework look like in practice?
A practical planning framework turns the tariff, supply, regulatory, and contractual inputs into a decision sequence with defined action items and trigger conditions.
Phase 1 — Assess (now)
Determine whether your AI workloads require owned GPU hardware or whether cloud GPU is sufficient for the next 18 months. Calculate the full tariff-inclusive landed cost for the hardware configuration you need, using the $27,000 reference price and the 25% tariff. Identify whether any cross-border operations create RASA compliance exposure. Reference cluster articles: Section 232 mechanics and BIS approval, HBM4 and GDDR7 shortage timeline, how the Remote Access Security Act changes cloud GPU strategy.

Phase 2 — Decide (within 30 days)
Choose owned, cloud, or hybrid allocation. If owned, select NVIDIA H200 or AMD MI325X based on workload compatibility and current lead-time quotations. If cloud, identify which hyperscaler offers the configuration and regional coverage you need. Reference: hyperscaler tariff exemptions and the two-tier market, the 2M-unit China backlog and H200 availability.

Phase 3 — Contract (30–60 days)
Secure vendor contracts with tariff-risk and export-licence contingency clauses. For cloud, lock in reserved pricing before Q4 2026 when supply-driven price increases are expected. For hardware, request confirmed delivery timelines in writing and include the material-adverse-change clause explicitly.

Phase 4 — Monitor (ongoing through 2027)
Track BIS export licence policy developments. Monitor AI OVERWATCH Act legislative progress. Watch HBM4 production announcements from SK Hynix and Samsung as the trigger for realistic Blackwell procurement windows. Track TSMC Arizona Phase 2 volume production announcements for the domestic supply timeline.
Trigger conditions that should cause you to reassess your plan:
- Tariff rate change (up or down)
- AI OVERWATCH Act passage in the Senate
- HBM4 volume production confirmation from SK Hynix or Samsung
- TSMC Arizona Fab 21 Phase 2 first advanced-node production announcement
- Any BIS policy change affecting H200 or MI325X domestic use cases
Supply-chain experts see early signs of stabilisation around 2027, but baseline pricing is unlikely to return to pre-2024 levels. Plan your 2026–2027 AI infrastructure on the assumption that constraints persist — and build trigger conditions into your procurement plan so you can move quickly when the picture shifts.
Frequently asked questions
Does the 25% AI chip tariff apply if I am renting GPUs from AWS, Azure, or Google Cloud?
No. The import tariff is applied at the US border when hardware is first imported. Cloud customers do not import hardware. Hyperscalers benefit from the TSMC fab-capacity multiplier — they can import GPUs at 2.5× their committed Arizona fab capacity without paying the tariff. Some pricing adjustment may occur as supply-chain costs work through the market, but there is no tariff line item on a cloud bill. RASA is a separate compliance consideration for cloud customers with cross-border operations, and is distinct from the import duty.
Does our company need to do anything about RASA compliance?
If you are a US-domiciled company using AWS, Azure, or GCP for domestic operations, no direct action is required. RASA has passed the US House but is not yet enacted — compliance obligations will sit with the cloud provider, not you. If you have operations in Indonesia, Singapore, or other export-control-adjacent jurisdictions, a legal review of your cloud GPU access arrangements is worth doing before RASA is enacted. Companies providing AI services to customers in export-restricted countries should review their service delivery architecture with legal counsel now — BIS is already applying existing EAR controls to cloud access arrangements.
What is the tariff exemption that hyperscalers get and how do I know if my company qualifies?
The exemption comes from committing capital to TSMC Arizona Fab 21 construction. Qualifying organisations can import GPUs at 2.5× their invested capacity during construction, 1.5× once operational, without paying the 25% tariff. The investment threshold required to make this meaningful is in the billions. There is no SMB-accessible equivalent. Assume the full 25% applies.
Should I be worried about the AI OVERWATCH Act affecting my GPU order?
The AI OVERWATCH Act passed out of the House Foreign Affairs Committee and has a bipartisan Senate companion bill, but still needs to pass both chambers and receive a presidential signature. If passed, it would impose a two-year ban on Blackwell-class GPU exports to China. For a US company buying hardware for domestic use, the main risk is contract disruption if your supplier cannot fulfil due to export constraints on their distribution chain. Include export-licence contingency clauses in all hardware contracts as a precaution.
When will GPU supply constraints ease?
HBM3e contracted allocations run through end of 2026; spot availability may appear in Q1 2027 as Samsung and Micron ramp production. HBM4 / Blackwell realistic volume availability is not before mid-to-late 2027. TSMC Arizona domestic production is expected to begin advanced-node volume production no earlier than H2 2027, with initial allocation priority to anchor investors. Planning assumption: H200-class hardware supply eases modestly in mid-2027; Blackwell availability remains tight through 2027.
Is AMD MI325X a viable alternative to NVIDIA H200 under the current tariff regime?
AMD MI325X carries the same 25% tariff treatment — there is no tariff advantage to choosing AMD. AMD may offer shorter lead times depending on distributor relationships. MI325X is competitive with H200 for inference workloads but is not a drop-in replacement for all training workloads. AMD’s supply chain is less hyperscaler-dominated, which may mean better spot availability for SMB buyers in 2026. A workload compatibility assessment is required before substituting AMD for NVIDIA.
How do I assess whether our cloud GPU provider is RASA-compliant?
AWS, Azure, and GCP have all indicated they are implementing RASA compliance processes. Using these three hyperscalers is structurally lower risk than non-hyperscaler or regional providers. Ask providers directly for their RASA compliance statement and confirm GPU availability in your required regions. For companies operating in or serving customers in Southeast Asia, request specific confirmation that GPU access from Australian, Singaporean, or Indonesian endpoints remains available.
What does “full upfront payment” mean in practice for GPU procurement?
NVIDIA’s current standard terms require 100% payment at order placement, not delivery. The full tariff cost must be funded at order time — $54,000 on an 8-GPU H200 cluster at the $27,000 reference price. The buyer bears full financial exposure if export licences are denied or tariffs change between order and delivery. Contract contingency clauses for these scenarios are not optional extras in the current environment.
What is the BIS four-criteria approval process and does it affect my company?
BIS (Bureau of Industry and Security, US Commerce Department) reviews export licence applications against four criteria: end-user, end-use, country of destination, and risk of diversion to restricted parties. For a US company buying hardware for domestic use, BIS review is not directly applicable. For companies with cross-border operations, shipment of GPU hardware to certain jurisdictions (China, Russia, others) requires BIS export licence approval and is subject to case-by-case denial. The AI OVERWATCH Act would add Congressional review as an additional approval gate for certain export categories.
What tariff exemptions exist for smaller tech companies?
No formal exemption mechanism is available outside the TSMC Arizona fab-investment framework. The Section 232 authority allows the President to grant product-specific or country-specific exclusions, but no SMB exemption has been announced. Monitor BIS and USTR notices for any product exclusion requests — these are announced via the Federal Register and worth tracking if you have procurement decisions coming up in 2027.
How does the 25% tariff affect the amortised cost per GPU-hour for owned vs cloud infrastructure?
The tariff is a one-time sunk cost that increases the capital cost basis amortised over the hardware’s useful life. For a $27,000 H200 with 25% tariff, the landed cost is $33,750. Amortised over four years at 60% average utilisation — roughly 21,024 GPU-hours — the per-GPU-hour cost increases proportionally: 25% more capital, 25% more cost per hour. Compare that tariff-adjusted amortisation against current reserved cloud instance pricing to determine which is lower-cost for your actual utilisation profile.
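Those figures can be checked in a few lines, under the same assumptions (4-year life, 60% utilisation, the $27,000 reference price):

```python
# Verify the FAQ arithmetic: landed cost, usable GPU-hours, and the
# proportional 25% increase in cost per GPU-hour.
landed = 27_000 * 1.25           # tariff-inclusive landed cost, USD
hours = 4 * 8_760 * 0.60         # GPU-hours over 4 years at 60% utilisation
print(f"landed ${landed:,.0f} over {hours:,.0f} GPU-hours: "
      f"${landed / hours:.2f}/hr vs ${27_000 / hours:.2f}/hr pre-tariff")
```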