Product | AMD Instinct™ MI100 Accelerator - 32GB HBM2 - PCIe 4.0 x16 - Passive Cooling | AMD Instinct™ MI210 Accelerator - 64GB HBM2e - PCIe 4.0 x16 - Passive Cooling | NVIDIA® A10 GPU Computing Accelerator - 24GB GDDR6 - PCIe 4.0 x16 - Passive Cooler (w/o CEC) | NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler | NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling (w/o CEC) | NVIDIA® RTX A6000 - 48GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4xDP) |
Main Specifications | ||||||
Product Series | AMD Instinct | AMD Instinct | Nvidia A10 | Nvidia A30 | Nvidia A40 | |
Core Type | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | |||
Core Clock Speed | 1502 MHz | 1700 MHz | 885 MHz (1695 MHz Boost Clock) | | | |
Host Interface | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 64GB/s | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 |
GPU Architecture | CDNA | CDNA2 | Ampere | Ampere | Ampere | |
Product Type | | | | | | Workstation |
Product Line | | | | | | NVIDIA Professional Graphics |
Memory Technology | | | | | | GDDR6 |
Memory Capacity | | | | | | 48 GB |
Detailed Specifications | ||||||
Streaming Processor Cores | 7,680 | 6,656 | | | 10,752 CUDA Cores | 10,752 Shading Units |
Compute Units | 120 | 104 | | | | |
NVIDIA Tensor Cores | | | | | 336 Tensor Cores | 336 |
NVIDIA RT Cores | | | 72 RT Cores | | 84 RT Cores | 84 |
Memory Clock Speed | 1.2 GHz | 1.6 GHz | 1563 MHz | | | 2000 MHz (16 Gbps effective) |
Memory Interface | 4096-bit | 4096-bit | 384-bit | 384-bit | ||
Memory Speeds (GT/s) | | | | | 14.5 Gbps GDDR6 | |
Max Memory Size | 32 GB HBM2 | 64 GB HBM2e | 24 GB GDDR6 | 24 GB HBM2 | 48 GB GDDR6 with error-correcting code (ECC) | |
Max Memory Bandwidth | Up to 1228.8 GB/s | Up to 1638.4 GB/s | 600 GB/s | 933 GB/s | 696 GB/s | |
Infinity Fabric™ Links | 3 | 3 | | | | |
Peak Infinity Fabric™ Link Bandwidth | 92 GB/s | 100 GB/s | | | | |
Peak FP64 | | | | 5.2 teraFLOPS | | |
Peak FP64 Tensor Core | | | | 10.3 teraFLOPS | | |
INT8 Tensor Core | | | 250 TOPS (500 TOPS with sparsity) | 330 TOPS (661 TOPS with sparsity) | | |
TF32 Tensor Core | | | 62.5 teraFLOPS (125 teraFLOPS with sparsity) | 82 teraFLOPS (165 teraFLOPS with sparsity) | | |
FP32 | | 22.6 TFLOPs | 31.2 teraFLOPS | 10.3 teraFLOPS | | |
Peak BFLOAT16 Tensor Core | | | 125 teraFLOPS (250 teraFLOPS with sparsity) | 165 teraFLOPS (330 teraFLOPS with sparsity) | | |
Peak FP16 Tensor Core | | | 125 teraFLOPS (250 teraFLOPS with sparsity) | 165 teraFLOPS (330 teraFLOPS with sparsity) | | |
Peak INT4 Tensor Core | | | 500 TOPS (1,000 TOPS with sparsity) | 661 TOPS (1,321 TOPS with sparsity) | | |
Total NVLink Bandwidth | | | | Third-gen NVLink: 200 GB/s | NVIDIA NVLink 112.5 GB/s (bidirectional); PCIe Gen4 16 GB/s | |
NVIDIA CUDA™ Technology | Yes | |||||
Peak Half Precision (FP16) Performance | 184.6 TFLOPs | | | | | |
Peak Single Precision Matrix (FP32) Performance | 46.1 TFLOPs | 45.3 TFLOPs | | | | |
Peak Double Precision Matrix (FP64) Performance | | 45.3 TFLOPs | | | | |
Peak Single Precision (FP32) Performance | 23.1 TFLOPs | | | | | |
Peak Double Precision (FP64) Performance | 11.5 TFLOPs | 22.6 TFLOPs | | | | |
Peak INT4 Performance | 184.6 TOPs | 181 TOPs | | | | |
Peak INT8 Performance | 184.6 TOPs | | | | | |
Peak bfloat16 | 92.3 TFLOPs | 181 TFLOPs | | | | |
ECC Protection | Yes (Full-Chip) | |||||
Transistor Count | 28.3 Billion | |||||
DisplayPort Connectors | | | | | 3x DisplayPort 1.4. The A40 is configured for virtualization by default, with its physical display connectors disabled; the display outputs can be enabled via management software tools. | |
OS Support | Linux x86_64 | Linux x86_64 | | | | |
Cooling | Passive | Passive | Passive | Passive | ||
Dual Slot | yes | yes | Single-slot | Dual-slot | 2-slot Low-profile | |
Dimensions | 10.5" (267 mm) Board Length | 10.5" (267 mm) Board Length | FHFL | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) | |
Form Factor | Full Height | |||||
Lithography | TSMC 7nm FinFET | 8 nm | Samsung 8nm | Samsung 8nm | ||
Supplementary Power Connectors | 2x PCIe 8-pin connectors | 1x8 pin 12V EPS | None | 1x 8-pin CPU (EPS12V) | 1x 8-pin CPU (EPS12V) | 1x 8-pin EPS |
Max Graphics Card Power (W) | 300W | 300W Peak | 150W | 165W | 300W | 300W |
Processor | | | | | | Ampere (GA102) |
Memory Bandwidth | | | | | | 768 GB/s |
Core Clock Speed | | | | | | 1455 MHz Base Clock, 1860 MHz Boost Clock |
L2 Cache Size | | | | | | 6 MB |
API Support | | | | | | CUDA 8.5, OpenCL 2.0, Shader Model 6.5, OpenGL 4.6, DirectX 12 Ultimate (12_2), Vulkan 1.2 |
Texture Fill Rate | | | | | | 625 GTexel/s |
Graphics Resolution | | | | | | 7680 x 4320 x 36 bpp at 60 Hz |
Peak Double Precision FP64 Performance | | | | | | 1,250 GFLOPS (1:32) |
Peak Single Precision FP32 Performance | | | | | | 38.7 TFLOPS |
Peak Half Precision FP16 Performance | | | | | | 40.00 TFLOPS (1:1) |
Multi-GPU Scalability | | | | | | NVLINK 2-way low profile (2-slot and 3-slot bridges) connects 2x NVIDIA RTX A6000 |
NVLink Interconnect | | | | | | 112.5 GB/s (bidirectional) |
VR Ready | | | | | | Yes |
Vulkan API | | | | | | 1.2 |
DisplayPort Output | | | | | | 4x DisplayPort 1.4a |
Minimum Recommended Power, Single Card (W) | | | | | | 700W |
Minimum Recommended Power, 2-Way (W) | | | | | | 850 |
Minimum Recommended Power, 3-Way (W) | | | | | | 1000 |
Minimum Recommended Power, 4-Way (W) | | | | | | 1200 |
Thermal Solution | | | | | | Active Heatsink |
Slot Height | | | | | | 2-Slot |
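The bandwidth figures in the table can be cross-checked against the listed memory clocks and bus widths. The sketch below is a rough sanity check, not a vendor formula: it assumes peak bandwidth equals the per-pin data rate multiplied by the bus width and divided by 8, that the HBM parts move two bits per pin per listed clock cycle, and that the A40 and RTX A6000 both use a 384-bit bus (the Memory Interface row does not make clear which cards its 384-bit entries belong to). The function name `peak_bandwidth_gb_s` is hypothetical.

```python
def peak_bandwidth_gb_s(per_pin_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s.

    per_pin_gbps: effective data rate per pin, in Gbit/s
    bus_width_bits: memory interface width, in bits
    """
    return per_pin_gbps * bus_width_bits / 8  # 8 bits per byte


# (per-pin rate in Gbit/s, bus width in bits), taken from the table above.
# Assumption: for the HBM cards the per-pin rate is 2x the listed memory clock.
# Assumption: the A40 and RTX A6000 use a 384-bit interface.
cards = {
    "MI100": (2 * 1.2, 4096),
    "MI210": (2 * 1.6, 4096),
    "A40": (14.5, 384),
    "RTX A6000": (16.0, 384),
}

for name, (rate, width) in cards.items():
    print(f"{name}: {peak_bandwidth_gb_s(rate, width):.1f} GB/s")
```

Under these assumptions the results line up with the listed values of 1228.8 GB/s (MI100), 1638.4 GB/s (MI210), 696 GB/s (A40), and 768 GB/s (RTX A6000).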