Why choose the H20?

96GB HBM3 at lower cost than H100. Large memory capacity suits inference for quantized 70B-class models. Available in China where H100/H200 are restricted.

When is the H20 not a good fit?

Significantly reduced compute compared to H100, so it is better suited for memory-bound inference than for training or latency-sensitive workloads.

What size AI models can the H20 run?

With 96GB of VRAM, the H20 is well suited to 30B-class models in FP16, and 70B-class models in 4-bit or 8-bit quantized form.
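The sizing guidance above can be checked with back-of-the-envelope arithmetic. The sketch below assumes weights dominate memory use, takes 2, 1, and 0.5 bytes per parameter for FP16, 8-bit, and 4-bit, and adds a hypothetical 20% allowance for KV cache and activations. The function names and the overhead figure are illustrative assumptions, not measured values.

```python
# Approximate bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

H20_VRAM_GB = 96  # from the VRAM figure quoted above


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Return the weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[precision]


def fits_on_h20(params_billions: float, precision: str,
                overhead_fraction: float = 0.2) -> bool:
    """Check whether weights plus an assumed overhead fit in 96 GB.

    overhead_fraction is a rough, assumed allowance for KV cache and
    activations; real usage depends on batch size and context length.
    """
    needed = weight_memory_gb(params_billions, precision) * (1 + overhead_fraction)
    return needed <= H20_VRAM_GB


if __name__ == "__main__":
    for size, precision in [(30, "fp16"), (70, "fp16"), (70, "int8"), (70, "int4")]:
        print(size, precision, weight_memory_gb(size, precision), fits_on_h20(size, precision))
```

Under these assumptions a 30B model in FP16 needs about 60 GB of weights and fits, while a 70B model needs about 140 GB in FP16 and only fits once quantized to 8-bit (about 70 GB) or 4-bit (about 35 GB).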

How much VRAM does the H20 have?

The H20 has 96GB of VRAM. Multi-GPU setups increase total memory, but that memory is not automatically pooled across GPUs.
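Because memory is not pooled, a model larger than one GPU has to be split explicitly, and each shard must fit in a single card's 96GB. The sketch below estimates the minimum GPU count under the assumption that a serving framework shards the weights evenly and that only a fraction of each card is usable for weights. The helper name and the 80% usable fraction are hypothetical.

```python
import math

H20_VRAM_GB = 96  # per-GPU memory quoted above


def min_gpus_for_weights(weight_gb: float, usable_fraction: float = 0.8) -> int:
    """Smallest GPU count whose even shards fit in each card's usable memory.

    usable_fraction is an assumed share of VRAM left for weights after
    reserving room for KV cache, activations, and runtime buffers.
    """
    usable_per_gpu = H20_VRAM_GB * usable_fraction
    return max(1, math.ceil(weight_gb / usable_per_gpu))


if __name__ == "__main__":
    print(min_gpus_for_weights(60))   # 30B-class model in FP16
    print(min_gpus_for_weights(140))  # 70B-class model in FP16
```

The count is a lower bound on capacity only; it says nothing about the communication cost of splitting a model across cards.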

What is the H20's memory bandwidth?

The H20 has 4,000 GB/s of memory bandwidth. Higher bandwidth helps with faster data transfer between GPU memory and compute cores.
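One way to see why bandwidth matters for inference: in single-stream text generation, each new token requires reading roughly all model weights from memory once, so bandwidth divided by weight size gives a rough ceiling on tokens per second. The sketch below applies that rule of thumb. It assumes batch size 1 and ignores KV-cache reads, compute time, and kernel overheads, so real throughput will be lower. The function name is illustrative.

```python
H20_BANDWIDTH_GB_S = 4000  # memory bandwidth quoted above


def max_decode_tokens_per_s(weight_gb: float,
                            bandwidth_gb_s: float = H20_BANDWIDTH_GB_S) -> float:
    """Bandwidth-bound upper limit on batch-1 decode speed.

    Assumes every generated token streams all weights from memory once.
    """
    return bandwidth_gb_s / weight_gb


if __name__ == "__main__":
    # 70B-class model in 8-bit: about 70 GB of weights.
    print(round(max_decode_tokens_per_s(70), 1))
    # 30B-class model in FP16: about 60 GB of weights.
    print(round(max_decode_tokens_per_s(60), 1))
```

Batching raises aggregate throughput beyond this figure because one pass over the weights serves several sequences at once.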

What data types does the H20 support?

The H20 supports 7 precision formats. Training: BF16, FP16, TF32, FP32. Inference: FP8, INT8. Scientific: FP64.

Does the H20 support NVLink?

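Yes. The technical specifications list NVLink at 900 GB/s alongside a PCIe Gen5 x16 system interface, so multi-GPU inference can use NVLink for GPU-to-GPU traffic. The H20's reduced compute, rather than its interconnect, is what makes it a weaker choice than the H100 for large distributed training jobs.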

Nvidia H20

Export-compliant Hopper inference GPU with lower compute than H100.

Key Specifications

Architecture	Hopper
Memory	96GB HBM3
Memory Bandwidth	4,000 GB/s
Release date	Early 2024

Technical Specifications


Architecture	NVIDIA Hopper
CUDA Cores	14,592
Tensor Cores	4th Generation
GPU Memory	96GB HBM3
Memory Bandwidth	4.0 TB/s
L2 Cache	60 MB
FP64	1 TFLOPS
FP32	44 TFLOPS
TF32 Tensor Core	74 TFLOPS
FP16 Tensor Core	148 TFLOPS
INT8 Tensor Core	296 TOPS
System Interface	PCIe Gen5 x16
NVLink	900 GB/s
Max Thermal Design Power (TDP)	350W
Multi-Instance GPUs	Up to 7 MIGs
Form Factor	SXM / PCIe
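The figures in this table explain why the H20 is described as a memory-bound inference card. Dividing peak compute by memory bandwidth gives the roofline "ridge point": the number of operations a kernel must perform per byte read before compute, rather than memory, becomes the limit. The sketch below computes it from the FP16 Tensor Core and bandwidth rows above. The function names are illustrative, and the figure of about 1 FLOP per byte for batch-1 FP16 decoding is a common approximation (2 FLOPs and 2 bytes per parameter per token), not a measurement.

```python
H20_FP16_TFLOPS = 148      # FP16 Tensor Core row above
H20_BANDWIDTH_TB_S = 4.0   # Memory Bandwidth row above


def ridge_point_flops_per_byte(peak_tflops: float, bandwidth_tb_s: float) -> float:
    """Arithmetic intensity at which compute and memory limits meet."""
    return peak_tflops / bandwidth_tb_s


def is_memory_bound(intensity_flops_per_byte: float) -> bool:
    """True if a workload of this intensity is limited by bandwidth on the H20."""
    return intensity_flops_per_byte < ridge_point_flops_per_byte(
        H20_FP16_TFLOPS, H20_BANDWIDTH_TB_S
    )


if __name__ == "__main__":
    print(ridge_point_flops_per_byte(H20_FP16_TFLOPS, H20_BANDWIDTH_TB_S))
    # Batch-1 FP16 decoding: roughly 1 FLOP per byte of weights read (assumed).
    print(is_memory_bound(1.0))
```

With these numbers the ridge point is 37 FLOPs per byte, so low-intensity work such as small-batch decoding stays bandwidth-limited and benefits from the 4.0 TB/s memory system, while high-intensity work such as large-batch training runs into the compute ceiling.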

Alternatives to Nvidia H20

Nvidia H100

Full Hopper with higher compute and NVLink. From $0.35/hr per GPU across 49 providers.

Nvidia L40S

Ada Lovelace inference GPU at lower cost. From $0.48/hr per GPU across 29 providers.

Nvidia A100

Previous-gen with higher compute and NVLink. From $0.13/hr per GPU across 39 providers.

Last updated March 25, 2026
