Insights | Business | SaaS | Technology
Dec 4, 2025

Computational Lithography Achieving Twenty Times Faster Optical Proximity Correction with GANs

AUTHOR

James A. Wondrasek

This article is part of our comprehensive exploration of the broader AI manufacturing revolution, examining how artificial intelligence is recursively building the infrastructure for its own expansion.

Samsung and NVIDIA just announced they’ve made optical proximity correction 20 times faster using generative adversarial networks. If you’re not deep in semiconductor manufacturing, that probably sounds like alphabet soup. But here’s why you should care: this breakthrough takes a weeks-long bottleneck in chip design and turns it into something you can knock out in hours.

Optical proximity correction—OPC for short—is one of those thankless computational headaches that sits between “we designed this chip” and “we can actually manufacture this chip.” It’s all about pre-distorting photomask patterns so that when light diffraction inevitably messes with them during manufacturing, you actually end up with what you wanted. Think of it as the semiconductor version of aiming left when you know the wind will push everything right.

Traditional OPC methods hit a wall at advanced nodes. When you’re trying to print features smaller than the wavelength of light you’re using, physics gets really difficult really fast. The computational complexity explodes. What used to take hours now takes days. And at 3nm and 2nm processes? You’re looking at physics simulations churning through hundreds of iterations per feature across billions of features per chip.

This is where GANs come in. As part of Samsung’s AI megafactory initiative, NVIDIA GPUs were deployed throughout its chip manufacturing process, using neural networks to skip the whole iterative simulation dance. Instead of calculating physics hundreds of times, a trained model just predicts the right mask pattern in one pass. Full-chip OPC that used to take 48+ hours? Now it finishes in under 2.5 hours.

What does faster OPC actually mean for you? It means catching yield issues months earlier. It means getting products to market when market windows actually matter. In competitive markets like smartphone processors, that’s worth tens to hundreds of millions.

What is optical proximity correction in semiconductor manufacturing?

When you shine light through a photomask to pattern silicon, physics messes with what you intended. Diffraction bends light around corners. Photoresist chemistry doesn’t respond uniformly. Lens aberrations blur edges. What you designed isn’t what actually prints on the wafer.

OPC compensates for all this by tweaking mask features before you manufacture anything. It adjusts feature sizes, nudges edge positions, and adds sub-resolution assist features—little helpers that make the main pattern print correctly but don’t show up in the final result themselves. The goal is simple: make the actual printed pattern match your design intent. Success is measured by edge placement error, which is just the deviation between where edges should be and where they actually end up.
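To make that metric concrete, here is a toy sketch of an EPE check in Python. The edge positions and tolerance are made up for illustration; real metrology works on full 2-D pattern contours, not a handful of 1-D edges.

```python
# Toy edge placement error (EPE) check: compare intended edge
# positions (nm) against simulated printed positions and flag
# any edge whose deviation exceeds a tolerance.
# All numbers here are illustrative, not real process data.

def edge_placement_errors(intended, printed):
    """Signed deviation (nm) per edge: printed minus intended."""
    return [p - i for i, p in zip(intended, printed)]

def passes_spec(intended, printed, tolerance_nm=1.5):
    """True if every edge lands within the tolerance band."""
    return all(abs(e) <= tolerance_nm
               for e in edge_placement_errors(intended, printed))

intended = [0.0, 20.0, 40.0, 60.0]   # where edges should be
printed  = [0.4, 19.2, 41.1, 60.3]   # where simulation says they land

print([round(e, 1) for e in edge_placement_errors(intended, printed)])
# [0.4, -0.8, 1.1, 0.3]
print(passes_spec(intended, printed))  # True at the default 1.5nm band
```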

But here’s the problem. Modern chips have billions of features. Every single one needs optimisation. Every single one interacts with its neighbours through optical effects. And every single one matters because you’re working at dimensions smaller than the light wavelength you’re using—193nm deep ultraviolet or 13.5nm extreme ultraviolet trying to print 3nm features.

Traditional OPC tackles this with physics-based simulation. Calculate how light propagates. Simulate photoresist chemistry. Iterate until edge placement error drops below threshold. Repeat billions of times. It works, but it’s painfully slow, and it gets slower every generation as feature sizes shrink and complexity grows.
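That calculate-evaluate-adjust loop can be caricatured in a few lines. This is a hedged sketch: the "physics model" below is a stand-in fixed process bias, not a diffraction simulation, and the 0.5 damping factor is arbitrary.

```python
# Caricature of iterative model-based OPC on a single 1-D edge.
# The "optical model" is a fake constant print bias; real OPC
# simulates diffraction and resist chemistry. Illustrative only.

def printed_position(mask_edge, bias=2.0):
    """Stand-in for a physics simulation: printing shifts the
    edge by a fixed process bias (nm)."""
    return mask_edge + bias

def iterative_opc(target, tolerance=0.1, max_iters=100):
    """Adjust the mask edge until the simulated print lands
    within tolerance of the target. Returns (mask_edge, iters)."""
    mask = target
    for i in range(1, max_iters + 1):
        error = printed_position(mask) - target  # calculate + evaluate
        if abs(error) <= tolerance:              # converged?
            return mask, i
        mask -= 0.5 * error                      # adjust, then repeat
    return mask, max_iters

mask, iters = iterative_opc(target=50.0)
print(mask, iters)  # 48.0625 6
```

Even this trivial model needs six simulate-adjust rounds for one edge; production OPC does hundreds per feature, across billions of features.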

How do generative adversarial networks accelerate optical proximity correction?

GANs replace iterative physics simulation with pattern recognition. Simple as that. A generator network learns to create mask corrections. A discriminator network evaluates whether the results meet lithography specs. Train these networks on millions of design-mask pairs from actual production runs, and you end up with a model that generates high-quality OPC solutions without ever touching a physics simulator.

The speed-up comes from killing the iteration loop. Traditional model-based OPC might simulate 100-500 iterations per feature, running physics calculations mostly sequentially on CPUs. GAN-OPC figures out the answer in one pass, leveraging GPU parallelism to process thousands of features at the same time.
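The contrast can be sketched with a stand-in generator: a single linear layer plus a nonlinearity in place of a trained deep network. The weights here are random, not learned from lithography data; the point is the shape of the computation, one batched pass over thousands of features rather than a per-feature loop.

```python
import numpy as np

# One-pass mask correction with a stand-in "generator": a single
# layer instead of a deep GAN. Weights are made up, not trained.

rng = np.random.default_rng(0)

n_features, dim = 10_000, 8          # toy: 10k features, 8 params each
designs = rng.standard_normal((n_features, dim))

W = rng.standard_normal((dim, dim)) * 0.1   # "learned" weights
b = rng.standard_normal(dim) * 0.01

def generate_corrections(x):
    """Single inference pass: every feature corrected in one
    batched matrix multiply (this is what maps well to GPUs)."""
    return x + np.tanh(x @ W + b)    # residual-style correction

corrected = generate_corrections(designs)
print(corrected.shape)   # (10000, 8)
```

One matrix multiply covers all 10,000 features at once; on a GPU the batch dimension is where the parallelism comes from.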

The networks incorporate lithography constraints—what researchers like to call physics-informed ML. Virtual process models trained on fab data give you the speed of neural networks combined with the rigour of physics simulation.

Now there is a trade-off here, and it’s accuracy. GAN-OPC runs 20-100x faster than full inverse lithography technology but with slightly lower precision. For most chip layers—metal interconnects, vias, that sort of thing—that’s perfectly acceptable. For the handful of absolutely critical layers where every nanometre counts, you still use the slower physics-based approaches. Smart fabs use hybrid workflows, applying the right tool to each layer based on how tight the tolerances are.

Samsung’s implementation combines NVIDIA’s cuLitho GPU-accelerated library with GAN models, creating production-ready OPC that ships actual products rather than just looking impressive in research papers.

Why did Samsung and NVIDIA achieve a 20x performance improvement?

Three things came together at once: GPU hardware parallelism, optimised software, and neural network architecture that just skips the expensive bits.

Start with the baseline. Traditional CPU-based model-based OPC on tools like Siemens Calibre processes features mostly one after another. Even with parallelisation, you’re fundamentally bottlenecked by the iterative nature of physics simulation—calculate, evaluate, adjust, repeat.

GPUs can process 10,000+ features at the same time. cuLitho optimises computational lithography algorithms specifically for NVIDIA GPUs, squeezing maximum performance out of the silicon. But the real breakthrough is the GAN architecture eliminating iteration entirely. Traditional OPC simulates optical diffraction hundreds of times per feature, each iteration refining the mask pattern just a bit. GAN-OPC looks at a design pattern and spits out the correction in one shot. One inference pass versus hundreds of simulation loops.

Samsung deployed this in actual fab operations, not just benchmarks. Full-chip OPC completing in 2.4 hours versus 48+ hours means design teams get feedback the same day instead of waiting a week. That enables daily engineering change orders instead of weekly cycles. Over a product development timeline measured in months, that’s the difference between catching problems early versus discovering them way too late.

The scalability matters too. At 3nm and 2nm process nodes, OPC complexity increases roughly 10x compared to 7nm. Without this speed-up, traditional approaches would become completely infeasible for quick turnaround.

What is the difference between GAN-OPC and inverse lithography technology?

Both optimise photomasks computationally, but they sit at completely different points on the speed-accuracy trade-off curve.

Inverse lithography technology works backwards from desired wafer patterns to ideal mask shapes using rigorous physics-based optimisation. It achieves the highest accuracy but needs 10-100x more computation than traditional OPC.

The accuracy hierarchy goes like this: ILT beats traditional model-based OPC beats GAN-OPC beats rule-based OPC. The computational cost hierarchy? ILT slowest, then model-based OPC, then GAN-OPC fastest for actual production use.

Use cases differ by how critical the layer is. SRAM cells, logic gates, anything with sub-10nm critical dimensions—that’s ILT territory where you need maximum precision. Features in the 10-50nm range, metal interconnects, vias—GAN-OPC delivers acceptable accuracy 20x faster.

Production fabs run hybrid workflows. Out of 60+ mask layers per chip, maybe 5-10 layers use ILT. The rest use GAN-OPC. This balances quality against throughput and economics. ILT mask costs run $500K-1M with weeks of turnaround. GAN-OPC enables faster iteration cycles without blowing through your mask budget.
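At its core, a hybrid workflow like that reduces to a routing decision per layer. The sketch below uses made-up layer names and thresholds, not any fab's actual recipe.

```python
# Hedged sketch of a hybrid OPC dispatcher: route each mask layer
# to ILT, GAN-OPC, or rule-based OPC by criticality. The layer
# names and cutoffs are illustrative only.

def choose_opc_method(layer):
    """Pick an OPC engine from a layer descriptor."""
    if layer["critical_dim_nm"] < 10 or layer["name"] in {"gate", "sram"}:
        return "ILT"            # maximum precision, slowest
    if layer["critical_dim_nm"] <= 50:
        return "GAN-OPC"        # ~20x faster, slight accuracy loss
    return "rule-based"         # relaxed layers, already fast enough

layers = [
    {"name": "gate",   "critical_dim_nm": 8},
    {"name": "metal1", "critical_dim_nm": 24},
    {"name": "via2",   "critical_dim_nm": 40},
    {"name": "metal9", "critical_dim_nm": 90},
]

plan = {l["name"]: choose_opc_method(l) for l in layers}
print(plan)
# {'gate': 'ILT', 'metal1': 'GAN-OPC', 'via2': 'GAN-OPC', 'metal9': 'rule-based'}
```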

How does EUV lithography increase optical proximity correction complexity?

Extreme ultraviolet lithography uses way fewer photons per feature than deep ultraviolet, creating random shot-noise variations that OPC has to account for statistically rather than deterministically.

DUV at 193nm wavelength delivers a comparatively huge number of photons per feature. EUV at 13.5nm wavelength uses 50-100x fewer photons for the same feature size. Fewer photons means more randomness. This stochastic variability shows up as pattern variations that traditional deterministic OPC just can’t model properly.
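The photon-count argument is easy to see with a toy Poisson simulation. The counts below (10,000 vs 100 photons) are illustrative round numbers, but they show why relative dose variation scales as one over the square root of the photon count.

```python
import numpy as np

# Toy shot-noise illustration: the dose delivered by N photons is
# Poisson-distributed, so relative variation goes as 1/sqrt(N).
# Photon counts here are illustrative, not measured EUV/DUV data.

rng = np.random.default_rng(42)

def relative_dose_sigma(mean_photons, trials=200_000):
    """Std-dev of delivered dose divided by the mean dose."""
    doses = rng.poisson(mean_photons, size=trials)
    return doses.std() / doses.mean()

duv = relative_dose_sigma(10_000)   # many photons per feature
euv = relative_dose_sigma(100)      # ~100x fewer photons

print(f"DUV-like: {duv:.3f}, EUV-like: {euv:.3f}")
```

Cutting the photon count by 100x makes the relative dose noise roughly 10x worse, which is exactly the variability OPC now has to treat statistically.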

EUV masks add yet another layer of complexity. Unlike transmissive DUV masks, EUV uses reflective masks with multilayer coatings. These create shadowing effects that change with illumination angle, requiring way more sophisticated three-dimensional physics models. Computational effort increases 3-5x compared to DUV OPC at equivalent feature sizes.

The 20x speed-up becomes necessary rather than nice-to-have at these advanced nodes. At 5nm and 3nm nodes using EUV extensively, traditional OPC simply can’t deliver quick enough turnaround for practical product development. GAN models trained on production data implicitly learn stochastic behaviour patterns without having to explicitly model every single random photon interaction.

What infrastructure is required for GPU-accelerated computational lithography?

You need NVIDIA A100 or H100 data centre GPUs. We’re talking 8-16 GPUs per OPC workstation for full-chip processing. H100 provides 80GB HBM2e memory starting around $1.75/hour in cloud setups, though production deployments typically stick with on-premises infrastructure for IP security reasons.

Multi-GPU scaling needs high-bandwidth interconnects—NVLink for GPU-to-GPU chat. System memory requirements are substantial: 500GB-2TB for handling large design databases. Storage needs NVMe SSD arrays, 10-50TB capacity for process development workloads.

Your software stack includes the cuLitho library, CUDA toolkit, and integration with existing electronic design automation tools like Siemens Calibre or Synopsys Proteus.

Power and cooling matter here. Expect 3-7kW per system, which means you need proper data centre infrastructure.

Capital investment runs $200K-500K per GPU workstation versus $50K-100K for traditional CPU-based OPC systems. On the surface that looks expensive, but faster iterations reduce time-to-market by 2-6 months, and that’s worth millions in revenue for competitive products.

Cloud alternatives exist—AWS and Azure both offer GPU instances—but IP security concerns limit adoption for sensitive chip designs. Most production OPC stays behind the company firewall.

What are the business implications of 20x faster OPC for chip design teams?

Twenty-times faster OPC completely transforms design economics by enabling daily instead of weekly iterations. As explored in the AI megafactory revolution, these computational breakthroughs are reshaping semiconductor manufacturing timelines and economics. This chops time-to-market by 2-4 months and slashes mask revision costs by 30-50%.

Your design teams can explore way more architecture variations in the same timeframe. Instead of betting everything on one approach and hoping it works out, you can try 5-10 alternatives and pick the best one.

Faster feedback cycles compress the learning curve. Engineering change orders that used to take a week now turn around same-day. Catching yield issues earlier in development shaves 3-6 months off your production ramp time.

Mask cost reduction comes from better upfront optimisation. Fewer respins because you caught problems in simulation rather than discovering them after you’ve already made silicon. That’s $500K-2M saved per tapeout.

The competitive advantage is all about timing. In markets where being first matters—smartphone processors, data centre chips, AI accelerators—launching 2-4 months earlier lets you capture premium pricing. That’s $10M-100M+ in additional revenue depending on your market window.

For fabless companies relying on foundries, faster OPC means quicker feedback loops with your manufacturing partner. For foundries like Samsung making this capability public, it’s a way to attract fabless customers who need the fastest possible turnaround times.

The organisational impact? Design teams stop being bottlenecked by OPC wait time. That shifts the focus from “waiting for results” to “what should we try next.”

FAQ Section

What computational resources did Samsung use to achieve the 20x speedup?

Samsung deployed NVIDIA GPUs throughout their chip manufacturing process running the cuLitho computational lithography library, processing full-chip designs with 8-16 GPUs per workstation. The system combines GPU parallelism with GAN-based mask optimisation models trained on Samsung’s own proprietary process data. Industry estimates suggest you’d need 32-128 GPUs for production deployment across multiple projects running at the same time.

Can smaller semiconductor companies afford GPU-accelerated OPC infrastructure?

Initial capital investment of $200K-500K per GPU workstation creates pretty serious barriers for startups, but there are alternatives. Cloud-based GPU access through AWS and Azure makes the technology more accessible—H100 instances start around $1.75/hour. Fabless companies typically rely on their foundry partners for OPC rather than trying to maintain in-house infrastructure. Design houses with 50-500 employees might be able to justify one GPU system for critical projects, spreading the costs across multiple tapeouts.

How accurate is GAN-OPC compared to traditional physics-based methods?

GAN-OPC achieves 95-98% of traditional model-based OPC accuracy while running 20x faster. That’s good enough for non-critical layers. Edge placement error typically increases by 0.5-1.5nm compared to rigorous physics simulation—perfectly acceptable for metal interconnects and vias. Critical layers like gates and SRAM cells still use slower ILT or model-based OPC for maximum precision.

Does GAN-OPC work for all semiconductor process nodes?

GAN-OPC provides the most value at advanced nodes—7nm and below—where traditional OPC becomes computationally infeasible. Mature nodes like 28nm and 40nm use simpler rule-based OPC that’s already plenty fast. The sweet spot is 3nm-7nm processes where complexity demands sophisticated OPC but production volumes justify the infrastructure investment.

What training data is required for GAN-OPC models?

GAN-OPC training needs millions of design-mask pairs from actual production or calibrated simulations, representing 3-12 months of data collection. Samsung trains on proprietary process data, which creates a real competitive advantage. The manufacturing line itself supplies this data: wafer processing generates the measurements needed to build virtual process models that predict printing behaviour. Foundries typically provide pre-trained GAN models to fabless customers as part of their process design kits.

How long does it take to implement GAN-OPC in production workflows?

Integration timeline spans 6-18 months. That includes GPU infrastructure deployment (2-3 months), cuLitho software integration with your existing EDA tools (3-6 months), GAN model training and validation (4-8 months), and production qualification (2-4 months). Early adopters face steeper learning curves. Followers benefit from mature workflows and foundry support, which cuts the timeline down to 6-12 months.

What are the risks of switching from traditional OPC to GAN-based methods?

Key risks include model accuracy variation across different design styles, dependency on training data quality, potential for systematic errors if your GANs learn incorrect patterns, and tool maturity concerns compared to 20-year-old Calibre platforms. Ways to manage these risks: use hybrid workflows with GANs for speed plus physics verification for critical layers, extensive validation against gold-standard simulations, and gradual rollout starting with non-critical products.

How does GAN-OPC affect mask manufacturing and inspection?

GAN-OPC generates masks that work with standard manufacturing equipment—variable shaped beam or multi-beam writers—so you don’t need to change your fab infrastructure. Mask inspection may pick up different error patterns than traditional OPC, which means you’ll need updated inspection recipes. Some mask shops report 10-20% reduction in write time because GAN-generated patterns are simpler.

Can GAN-OPC models transfer between different semiconductor foundries?

GAN models are process-specific and generally won’t transfer between foundries because of proprietary equipment, materials, and process parameters. A model trained on Samsung 3nm won’t work for TSMC 3nm without retraining. However, transfer learning techniques allow faster adaptation, reducing training data requirements by 50-70% when you’re switching between similar processes.

What is NVIDIA cuLitho and how does it differ from traditional EDA tools?

NVIDIA cuLitho is a GPU-accelerated computational lithography library that provides OPC, ILT, and mask verification functions optimised for NVIDIA GPUs. Unlike CPU-based tools like Siemens Calibre or Synopsys Proteus, cuLitho leverages thousands of GPU cores for massive parallelisation. It integrates with your existing EDA workflows rather than replacing them—think of it as an acceleration layer. Samsung plans to develop GPU-accelerated electronic design automation tools using cuLitho as the foundation.

How does physics-informed machine learning improve upon pure data-driven GANs?

Physics-informed ML incorporates lithography equations—optical diffraction, photoresist chemistry, all that stuff—directly into neural network architectures. This constrains predictions to physically plausible solutions. The benefits? It reduces training data requirements by 30-50%, improves generalisation to novel design patterns, and prevents non-physical solutions from sneaking through. Virtual process models trained with fab data and AI/ML recommend optimisations that are actually grounded in real physics. Samsung’s implementation uses physics-informed discriminators that verify GAN outputs against Maxwell’s equations.

What competitive advantages does GAN-OPC provide to semiconductor manufacturers?

Manufacturers with mature GAN-OPC deployment achieve 2-4 month time-to-market advantages. That enables first-mover pricing power worth hundreds of millions in competitive markets like smartphone processors. Faster iterations improve yield learning curves, which cuts production costs by 5-15%. Technology leadership also signals to the market and attracts premium customers willing to pay for cutting-edge process nodes. Samsung’s public announcement is all about attracting fabless customers who need the fastest turnaround times available.

Conclusion: From Breakthrough to Implementation

The 20x performance improvement in optical proximity correction represents more than a computational achievement—it fundamentally changes how chip design teams operate. By collapsing multi-day OPC cycles into hours, GAN-based approaches enable rapid iteration that was simply impossible with traditional physics simulation.

For organisations considering applying computational lithography techniques, the key question isn’t whether to adopt GPU-accelerated OPC, but when and how. Fabless companies should evaluate foundry OPC capabilities as part of vendor selection criteria. Design houses need to weigh capital investment in GPU infrastructure against potential time-to-market advantages. Product teams should factor faster iteration cycles into development timelines.

The implementation considerations for OPC extend beyond technical specifications to organisational readiness, vendor partnerships, and workflow integration. Success requires alignment between design teams, manufacturing partners, and technology platforms.

Samsung and NVIDIA’s breakthrough demonstrates that the computational bottlenecks in advanced semiconductor manufacturing aren’t insurmountable—they’re engineering challenges waiting for the right combination of hardware, software, and machine learning architecture. As the industry pushes toward 2nm and beyond, GPU-accelerated computational lithography moves from competitive advantage to table stakes.
