Nvidia A16

Purpose-built for virtual desktop (VDI) deployments.

Key Specifications

Architecture	Ampere
Memory	4x 16GB GDDR6
Memory Bandwidth	4x 200 GB/s
Release date	Q2 2021

Compare Cloud Provider Prices

We don't see much volatility for the A16 right now. Most providers are clustered between $0.47 and $0.56/hr per GPU, so availability is likely the deciding factor.

Default ranking

Our algorithm weighs five factors to find the most relevant matches for you:

Price: We blend total hourly price with price-per-GPU to balance affordability and value.
Specs: We favor offers with higher CPU, RAM, and GPU Memory.
Billing: We favor on-demand billing for simplicity and flexibility over spot instances, reservations, and custom quotes.
Location: We blend both datacenter proximity and provider HQ location. Datacenter location matters for latency, while HQ location matters for compliance and support.
Provider diversity: Each time a provider appears in the list, their subsequent offerings are ranked a little lower, so one provider's offerings don't crowd out the top positions.
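
The provider-diversity factor can be pictured as a greedy re-ranking. The sketch below is only an illustration under stated assumptions, not the site's actual algorithm: the function name `rank_with_diversity`, the `penalty` value, the placeholder provider names, and the base scores are all hypothetical, and the other four factors are assumed to be already folded into each offer's base score.

```python
from collections import Counter


def rank_with_diversity(offers, penalty=0.1):
    """Greedily order (provider, base_score) pairs.

    Each time a provider is placed, its remaining offers lose
    `penalty` points, so one provider cannot fill the top slots.
    The penalty size is an assumed value for illustration.
    """
    remaining = list(offers)
    placed = Counter()  # how many offers each provider already has in the ranking
    ranked = []
    while remaining:
        best = max(remaining, key=lambda o: o[1] - penalty * placed[o[0]])
        remaining.remove(best)
        placed[best[0]] += 1
        ranked.append(best)
    return ranked


# Made-up base scores for three placeholder providers.
offers = [("A", 0.90), ("A", 0.88), ("B", 0.85), ("C", 0.80)]
ranking = rank_with_diversity(offers)
print(ranking)
```

With these made-up scores, provider A's second offer drops below the offers from B and C even though its base score is higher, which is the "ranked a little lower" behavior described above.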

Sorting and filtering

Click any column header to sort by that column. Use the filters above the table to narrow results by billing type, GPU count, vCPUs, or RAM. Custom sorting resets the default relevance ranking.

Transparency and funding

Ads and sponsors: Any paid placements are fixed at the top and clearly labeled as sponsored content.

Affiliates: Any affiliate links will be indicated to you as well. We may earn a commission if you click them, but this never influences the ranking order.

Provider	GPUs	Total VRAM	vCPUs	RAM	Billing	$/GPU/h	Total/h	Availability
Vultr	1x A16 16GB (vcg-a16-6c-64g-16vram)	16GB	6	64GB	On-Demand	$0.47	$0.47	Available
Vultr	2x A16 16GB (vcg-a16-12c-128g-32vram)	32GB	12	128GB	On-Demand	$0.47	$0.94	Available
Sesterce	1x A16 16GB	16GB	6	64GB	On-Demand	$0.56	$0.56	Available
Vultr	4x A16 16GB (vcg-a16-24c-256g-64vram)	64GB	24	256GB	On-Demand	$0.47	$1.88	Available
Sesterce	2x A16 16GB	32GB	12	128GB	On-Demand	$0.56	$1.12	Available
Vultr	8x A16 16GB (vcg-a16-48c-496g-128vram)	128GB	48	496GB	On-Demand	$0.47	$3.77	Available
Runcrate	1x A16 16GB PCIe	16GB	6	64GB	On-Demand	$0.56	$0.56	Available
Sesterce	4x A16 16GB	64GB	24	256GB	On-Demand	$0.56	$2.25	Available
Sesterce	8x A16 16GB	128GB	48	496GB	On-Demand	$0.56	$4.50	Available
Vultr	16x A16 16GB (vcg-a16-96c-960g-256vram)	256GB	96	960GB	On-Demand	$0.47	$7.53	Low stock

On-Demand: pay-as-you-go pricing with no term commitments.

Heads up: We do our best to keep these specs & prices accurate. However, cloud costs may fluctuate based on region, usage, and other factors not listed here. These are estimates based on common setups and are for informational purposes only. Always verify current rates & exact specs with the provider before provisioning.
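
For budgeting, the hourly prices in the table can be projected to a monthly figure. This is a minimal sketch, assuming the 720-hour month this page uses in its FAQ and the rounded per-GPU rates shown in the table. `monthly_cost` is a hypothetical helper name. Because the inputs are rounded, results can differ by a few dollars from figures computed from unrounded prices.

```python
HOURS_PER_MONTH = 720  # 30 days x 24 hours, the convention used on this page


def monthly_cost(price_per_gpu_hr, gpus=1, hours=HOURS_PER_MONTH):
    """Project an hourly per-GPU price to a cost for `hours` of continuous use."""
    return round(price_per_gpu_hr * gpus * hours, 2)


# Rounded on-demand rates from the table (lowest and highest per-GPU price).
low = monthly_cost(0.47)
high = monthly_cost(0.56)
four_gpu_low = monthly_cost(0.47, gpus=4)
print(f"1 GPU: ${low:.2f} to ${high:.2f} per month")
print(f"4 GPUs at the low rate: ${four_gpu_low:.2f} per month")
```

The same function covers partial usage: passing `hours=160`, for example, estimates roughly one working month of eight-hour days.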

Frequently Asked Questions

Why choose the A16?: The A16 puts four 16GB GPUs on a single card (64GB total). It is designed for virtual desktop infrastructure (VDI) and multi-user GPU sharing, and it includes hardware video encode and decode.
When is the A16 not a good fit?: The A16 is built for VDI and multi-user GPU sharing. Each GPU die has limited compute, so the card is better suited to remote desktops and lightweight graphics than to AI workloads.
Are A16 prices going up or down?: On-demand pricing has decreased by about 9% since July 2025, dropping from $0.58 to $0.52/hr per GPU.
What size AI models can the A16 run?: With 16GB of VRAM, the A16 is best for 7B-class models in 4-bit or 8-bit quantized form, and smaller models in FP16.
How much VRAM does the A16 have?: Each A16 GPU has 16GB of GDDR6 VRAM, and a full card carries four GPUs for 64GB total. Multi-GPU setups increase total memory, but that memory is not automatically pooled across GPUs.
What is the A16's memory bandwidth?: Each A16 GPU has 200 GB/s of memory bandwidth (4x 200 GB/s per card). Higher bandwidth moves data between GPU memory and the compute cores faster.
What data types does the A16 support?: The A16 supports 5 precision formats. Training: BF16, FP16, TF32, FP32. Inference: INT8.
Does the A16 support NVLink?: No. The A16 is a PCIe-only GPU with no NVLink, so it is better suited to single-GPU inference and smaller-scale workloads than large distributed training jobs.
How much does the A16 cost per hour?: A16 pricing currently ranges from $0.47/hr to $0.56/hr per GPU, depending on the provider, instance type, and billing model.
How much does the A16 cost per month?: At 720 hours per month, one A16 costs between $339.03 and $405.90 per month, depending on the provider. Reserved and spot pricing can lower that further.
Which cloud providers offer the A16?: The A16 is available from 3 cloud providers: Vultr, Sesterce, Runcrate.
Can I rent the A16 in the cloud?: Yes. We currently track 10 A16 listings across 3 cloud providers:

Billing type	Listings	Avg $/GPU/hr
On-demand	10	$0.52/hr
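
The model-size guidance in the FAQ above follows from simple weight-memory arithmetic: parameters times bits per parameter, divided by eight, gives bytes. The sketch below is a rough estimate only. The helper names and the 20% headroom for KV cache, activations, and runtime overhead are assumptions for illustration, and real usage depends on context length, batch size, and the inference framework.

```python
def weight_memory_gb(params_billion, bits_per_param):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def fits_in_vram(params_billion, bits_per_param, vram_gb=16, headroom=0.20):
    """Check whether weights plus an assumed headroom fit in one GPU's VRAM.

    `headroom` is a hypothetical allowance, not a measured figure.
    """
    needed = weight_memory_gb(params_billion, bits_per_param) * (1 + headroom)
    return needed <= vram_gb


for bits in (4, 8, 16):
    gb = weight_memory_gb(7, bits)
    print(f"7B at {bits}-bit: {gb:.1f} GB weights, fits in 16GB: {fits_in_vram(7, bits)}")
```

Under these assumptions a 7B model fits at 4-bit and 8-bit but not at 16-bit, where the weights alone take about 14GB and leave too little room for anything else. Memory is per GPU here because, as noted above, VRAM is not automatically pooled across the card's four GPUs.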

Technical Specifications


GPU Architecture	NVIDIA Ampere architecture
GPU Memory	4x 16 GB GDDR6
Memory Bandwidth	4x 200 GB/s
Error-Correcting Code (ECC)	Yes
NVIDIA Ampere architecture-based CUDA Cores	4x 1280
NVIDIA Third-Generation Tensor Cores	4x 40
NVIDIA Second-Generation RT Cores	4x 10
FP32 | TF32 | TF32 with sparsity (TFLOPS)	4x 4.5 | 4x 9 | 4x 18
FP16 | FP16 with sparsity (TFLOPS)	4x 17.9 | 4x 35.9
INT8 | INT8 with sparsity (TOPS)	4x 35.9 | 4x 71.8
System Interface	PCIe Gen4 (x16)
Max Power Consumption	250W
Thermal Solution	Passive
Form Factor	Full height, full length (FHFL) Dual Slot
Power Connector	8-pin CPU
Encode/Decode Engines	4 NVENC, 8 NVDEC (includes AV1 decode)
Secure and Measured Boot with Hardware Root of Trust for GPU	Yes (optional)
vGPU Software Support	NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Virtual Workstation (vWS), NVIDIA AI Enterprise, NVIDIA Virtual Compute Server (vCS)
Graphics APIs	DirectX 12.0, Shader Model 5.1, OpenGL 4.6, Vulkan 1.1
Compute APIs	CUDA, DirectCompute, OpenCL™, OpenACC®
MIG Support	No