Nvidia H20

Export-compliant Hopper inference GPU with lower compute than H100.

Key Specifications

Architecture: Hopper
Memory: 96GB HBM3
Memory Bandwidth: 4,000 GB/s
Release date: Early 2024

Frequently Asked Questions

Why choose the H20?

The H20 offers 96GB of HBM3 at a lower cost than the H100. Its large memory capacity suits inference for 70B-class models in quantized form. It is also available in China, where the H100 and H200 are restricted.

When is the H20 not a good fit?

The H20 has significantly less compute than the H100, so it is better suited to memory-bound inference than to training or latency-sensitive workloads.

What size AI models can the H20 run?

With 96GB of VRAM, the H20 is well suited to 30B-class models in FP16, and 70B-class models in 4-bit or 8-bit quantized form.
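The sizing above follows from simple arithmetic: weight memory is roughly parameter count times bytes per parameter. The sketch below is a weights-only estimate. It ignores the KV cache, activations, and framework overhead, so real requirements are higher. The function names and the `headroom_gb` parameter are illustrative, not part of any library.

```python
# Rough weights-only memory estimate: parameters x bytes per parameter.
# KV cache, activations, and runtime overhead are NOT included.

H20_VRAM_GB = 96  # from the specifications on this page

# Bytes per parameter for common weight precisions (assumed, no quantization metadata).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate weight memory in GB, using 1 GB = 1e9 bytes."""
    return params_billions * BYTES_PER_PARAM[precision]


def fits_on_h20(params_billions: float, precision: str, headroom_gb: float = 0.0) -> bool:
    """True if the weights plus an optional headroom allowance fit in 96 GB."""
    return weight_memory_gb(params_billions, precision) + headroom_gb <= H20_VRAM_GB


if __name__ == "__main__":
    for size, precision in [(30, "fp16"), (70, "fp16"), (70, "int8"), (70, "int4")]:
        gb = weight_memory_gb(size, precision)
        print(f"{size}B {precision}: ~{gb:.0f} GB weights, fits={fits_on_h20(size, precision)}")
```

Under these assumptions a 30B model in FP16 needs about 60 GB and a 70B model in 8-bit about 70 GB, while a 70B model in FP16 needs about 140 GB and does not fit on a single card.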

How much VRAM does the H20 have?

The H20 has 96GB of VRAM. Multi-GPU setups increase total memory, but that memory is not automatically pooled across GPUs.

What is the H20's memory bandwidth?

The H20 has 4,000 GB/s (4.0 TB/s) of memory bandwidth. Higher bandwidth moves data between GPU memory and the compute cores faster, which matters most for memory-bound workloads such as LLM token generation.
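To see why bandwidth matters for inference, consider a common back-of-the-envelope bound: during single-stream decoding, each generated token requires reading roughly all model weights once, so tokens per second cannot exceed bandwidth divided by weight size. The sketch below applies that bound. It is an idealized ceiling that assumes batch size 1 and ignores KV cache reads and kernel efficiency, and the function name is illustrative.

```python
# Idealized upper bound on single-stream decode speed:
#   tokens/s <= memory bandwidth / bytes read per token (approx. the weight size)
# Real throughput is lower; this ignores KV cache traffic and kernel inefficiency.

H20_BANDWIDTH_GB_PER_S = 4000.0  # from the specifications on this page


def max_decode_tokens_per_s(weight_gb: float,
                            bandwidth_gb_per_s: float = H20_BANDWIDTH_GB_PER_S) -> float:
    """Bandwidth-limited ceiling on tokens per second at batch size 1."""
    if weight_gb <= 0:
        raise ValueError("weight_gb must be positive")
    return bandwidth_gb_per_s / weight_gb


if __name__ == "__main__":
    # Weight sizes assume 2, 1, and 0.5 bytes per parameter respectively.
    for label, weight_gb in [("30B fp16", 60.0), ("70B int8", 70.0), ("70B int4", 35.0)]:
        print(f"{label}: at most ~{max_decode_tokens_per_s(weight_gb):.0f} tokens/s")
```

Batching several requests together amortizes each weight read over more tokens, which is why aggregate throughput can exceed this single-stream figure.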

What data types does the H20 support?

The H20 supports 7 precision formats. Training: BF16, FP16, TF32, FP32. Inference: FP8, INT8. Scientific: FP64.

Does the H20 support NVLink?

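Yes. The technical specifications list NVLink at 900 GB/s alongside a PCIe Gen5 x16 system interface, so multi-GPU inference can use NVLink for GPU-to-GPU traffic. Because of its reduced compute, the H20 is still better suited to inference than to large distributed training jobs.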

Technical Specifications

Architecture: NVIDIA Hopper
CUDA Cores: 14,592
Tensor Cores: 4th Generation
GPU Memory: 96GB HBM3
Memory Bandwidth: 4.0 TB/s
L2 Cache: 60 MB
FP64: 1 TFLOPS
FP32: 44 TFLOPS
TF32 Tensor Core: 74 TFLOPS
FP16 Tensor Core: 148 TFLOPS
INT8 Tensor Core: 296 TOPS
System Interface: PCIe Gen5 x16
NVLink: 900 GB/s
Max Thermal Design Power (TDP): 350W
Multi-Instance GPU (MIG): Up to 7 instances
Form Factor: SXM / PCIe
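
The figures above can be combined into a roofline-style "ridge point": peak compute divided by memory bandwidth, in operations per byte. Workloads with lower arithmetic intensity than the ridge point are limited by memory bandwidth rather than compute. The sketch below uses only the listed FP16 and INT8 Tensor Core numbers. The decode intensity model (about 2 FLOPs per parameter per token, weights read once per step and shared across the batch) is a simplifying assumption, and all names are illustrative.

```python
# Roofline-style ridge point from the listed specifications.
# ridge point (ops/byte) = peak tera-ops per second / memory bandwidth in TB/s

PEAK_TERA_OPS = {"fp16": 148.0, "int8": 296.0}  # Tensor Core figures from the table
BANDWIDTH_TB_PER_S = 4.0                         # memory bandwidth from the table


def ridge_point(precision: str) -> float:
    """Arithmetic intensity (ops per byte) at which compute and bandwidth balance."""
    return PEAK_TERA_OPS[precision] / BANDWIDTH_TB_PER_S


def decode_intensity(batch_size: int, bytes_per_param: float = 2.0) -> float:
    """Assumed intensity of LLM decoding: ~2 FLOPs per parameter per token,
    with each weight read once per step and reused across the batch."""
    return 2.0 * batch_size / bytes_per_param


def is_memory_bound(ops_per_byte: float, precision: str = "fp16") -> bool:
    """True if the workload sits below the ridge point for this precision."""
    return ops_per_byte < ridge_point(precision)


if __name__ == "__main__":
    print(f"FP16 ridge point: {ridge_point('fp16'):.0f} FLOPs/byte")
    for batch in (1, 8, 64):
        intensity = decode_intensity(batch)
        print(f"batch {batch}: ~{intensity:.0f} FLOPs/byte, "
              f"memory_bound={is_memory_bound(intensity)}")
```

Under these assumptions, FP16 decoding stays memory-bound until the batch size approaches the ridge point, which is consistent with positioning the H20 for memory-bound inference.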
