NVIDIA A800 40GB Active Graphics Card

The ultimate workstation development platform for AI, data science, and HPC.

Where to Buy

Find an NVIDIA design and visualization partner.

The Supercomputing Platform for Workstations

The NVIDIA® A800 40GB Active GPU, powered by the NVIDIA Ampere architecture, is the ultimate workstation development platform with NVIDIA AI Enterprise software included, delivering powerful performance to accelerate next-generation data science, AI, HPC, and engineering simulation/CAE workloads.

Get Started With NVIDIA A800 40GB Active

Learn how to set up NVIDIA A800 40GB Active with a companion GPU for display and activate the NVIDIA AI Enterprise license.

Highlights

Industry Leading Performance

Double-Precision (FP64) Performance

9.7 TFLOPS¹

Tensor Performance

1,247 AI TOPS²

Memory Bandwidth

1.5 TB/s

¹ Peak rates based on GPU Boost Clock.
² Theoretical INT8 TOPS using sparsity.
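The headline numbers follow from the card's published core count and clock. A rough sketch, assuming a GPU Boost Clock of about 1.41 GHz and Ampere GA100's half-rate FP64 units (one FP64 unit per two CUDA cores, each retiring one FMA, i.e. 2 FLOPs, per cycle):

```python
# Sketch: derive the peak rates from published A800 40GB Active specs.
# Assumptions: ~1.41 GHz boost clock; half-rate FP64 (one unit per two CUDA cores).

cuda_cores = 6912
boost_clock_hz = 1.41e9           # assumed GPU Boost Clock

fp64_units = cuda_cores // 2      # 3,456 half-rate FP64 units
fp64_tflops = fp64_units * 2 * boost_clock_hz / 1e12
print(round(fp64_tflops, 1))      # ~9.7 TFLOPS, matching the spec

# FP32 runs on all CUDA cores at the same 2 FLOPs per cycle:
fp32_tflops = cuda_cores * 2 * boost_clock_hz / 1e12
print(round(fp32_tflops, 1))      # ~19.5 TFLOPS

# Structural sparsity doubles dense tensor throughput:
sparse_tops = 623.8 * 2
print(round(sparse_tops))         # ~1,247 AI TOPS (after rounding)
```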

Features

Powered by the NVIDIA Ampere Architecture

Third-Generation Tensor Cores

Third-generation Tensor Cores deliver performance and versatility for a wide range of AI and HPC applications. Support for double-precision (FP64) and Tensor Float 32 (TF32) precision provides up to 2X the performance and efficiency of the previous generation, enabling rapid model training and inference directly on RTX-powered AI workstations. Hardware support for structural sparsity doubles inference throughput.

Multi-Instance GPU

Fully isolated and secure multi-tenancy at the hardware level with dedicated high-bandwidth memory, cache, and compute cores. Multi-Instance GPU (MIG) maximizes the utilization of GPU-accelerated infrastructure, allowing an A800 40GB Active GPU to be partitioned into as many as seven independent instances, giving multiple users access to GPU acceleration.
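Partitioning is managed through the standard nvidia-smi MIG commands. A minimal sketch, assuming device index 0, a MIG-capable driver, and root privileges (exact profile names and IDs vary by driver version):

```shell
# Enable MIG mode on the A800 (takes effect after a GPU reset):
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles this driver exposes:
nvidia-smi mig -lgip

# Create seven 1g.5gb GPU instances and their default compute instances:
sudo nvidia-smi mig -i 0 -cgi 1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb -C

# Verify the resulting MIG devices:
nvidia-smi -L
```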

Third-Generation NVIDIA NVLink

Increased GPU-to-GPU interconnect bandwidth provides a single scalable memory space to accelerate compute workloads and tackle larger datasets. Connect a pair of NVIDIA A800 40GB Active GPUs with NVIDIA NVLink® to increase the effective memory footprint to 80GB and scale application performance by enabling GPU-to-GPU data transfers at rates up to 400 GB/s (bidirectional).

Ultra-Fast HBM2 Memory

Deliver massive computational throughput with 40GB of high-speed HBM2 memory and class-leading memory bandwidth of 1.5 TB/s, an increase of over 70% compared to the previous generation. The A800 40GB Active also features significantly more on-chip memory, including a 40MB level 2 cache, to accelerate the most computationally intensive AI and HPC workloads.

Workloads

Supercharge AI and HPC Workflows Across Industries

Generative AI

Using neural networks to identify patterns and structures within existing data, generative AI applications enable users to generate new and original content, including images, sounds, animation, and 3D models, from a wide variety of inputs. Leverage NVIDIA’s generative AI solution, the NeMo™ Framework, included in NVIDIA AI Enterprise, along with the A800 40GB Active GPU for easy, fast, and customizable generative AI model development.


Engineering Simulation/CAE

The A800 40GB Active GPU delivers remarkable performance for GPU-accelerated computer-aided engineering (CAE) applications. Engineering analysts and CAE specialists can run large-scale simulations and engineering analysis codes in full FP64 precision with incredible speed, shortening development timelines and accelerating time to value.

With the addition of an RTX-accelerated GPU to provide display capabilities for pre- and post-processing, designers and engineers can visualize large-scale simulations and models in full design fidelity.


Data Science and Data Analytics

Accelerate end-to-end data science and analytics workflows with powerful performance to extract meaningful insights from large-scale datasets quickly. By combining the high-performance computing capabilities of the A800 40GB Active with NVIDIA AI Enterprise, data practitioners can leverage a large collection of libraries, tools, and technologies to accelerate data science workflows—from data prep and analysis to modeling.


AI Training and Inference

Offload data center and cloud-based computing resources and bring supercomputing performance to the desktop for local AI training and inference workloads. Powerful workstations with four A800 40GB Active GPUs provide over 2.5 petaflops of AI computing performance and 160GB of combined HBM2 memory.
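The four-GPU workstation figures follow directly from the single-GPU spec table:

```python
# Sketch: aggregate specs for a workstation with four A800 40GB Active GPUs.
per_gpu_sparse_tflops = 623.8   # peak tensor TFLOPS with sparsity (spec table)
per_gpu_memory_gb = 40          # HBM2 per GPU

gpus = 4
total_pflops = gpus * per_gpu_sparse_tflops / 1000
total_memory_gb = gpus * per_gpu_memory_gb

print(round(total_pflops, 2))   # ~2.5 petaflops of AI compute
print(total_memory_gb)          # 160GB of combined HBM2 memory
```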


AI Platform

Production-Ready AI With NVIDIA AI Enterprise

Out-of-the-Box AI Development

Each NVIDIA A800 40GB Active GPU comes with a three-year subscription to NVIDIA AI Enterprise, an end-to-end enterprise software platform for rapid development and deployment of production-ready generative AI, computer vision, speech AI, and more. Software activation is required.

Accelerated Data Pipelines

NVIDIA AI Enterprise includes data science libraries and tools to speed time to insights. Organizations can use NVIDIA RAPIDS™ for up to 50X faster end-to-end data science pipelines.
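RAPIDS cuDF mirrors much of the pandas API, so existing CPU pipelines often port with little more than an import change. A minimal sketch using pandas with hypothetical sensor data; on a RAPIDS-enabled system the import below can typically be swapped for `import cudf as pd` to run the same steps on the GPU:

```python
# CPU version of a prep-and-analyze pipeline; swap pandas for cuDF
# (`import cudf as pd`) on a RAPIDS-enabled system to accelerate it.
import pandas as pd

df = pd.DataFrame({
    "sensor": ["a", "a", "b", "b"],        # hypothetical example data
    "reading": [1.0, 3.0, 2.0, 4.0],
})

# Typical steps: filter rows, derive a column, aggregate per group.
df = df[df["reading"] > 0.5]
df["reading_sq"] = df["reading"] ** 2
summary = df.groupby("sensor")["reading"].mean()
print(summary["a"], summary["b"])          # 2.0 3.0
```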

AI Training and Inference

NVIDIA AI Enterprise accelerates every stage of the AI journey, from data prep and model training through inference and deployment at scale.

Performance

Tackle Demanding AI and HPC Workloads

The NVIDIA A800 40GB Active GPU delivers incredible performance for the most demanding workflows on workstation platforms, from AI training and inference to complex engineering simulation, modeling, and data analysis. With more than 2X the performance of the previous generation, the A800 40GB Active handles a wide range of compute-intensive workloads.

AI Training - ResNet-50 V1.5

ResNet-50 V1.5 Training. Batch Size=256; Precision=Mixed.

AI Training - BERT - Large

BERT Large Pre-Training Phase 2. Batch Size=8; Precision=Mixed.

HPC - GTC

GTC Version 4.5, TAE, Precision=FP32.

HPC - LAMMPS

LAMMPS patch_8Feb2023, Atomic Fluid Lennard-Jones 2.5 (cutoff); Precision=FP64.

AI Inference - ResNet-50 V1.5

ResNet-50 V1.5 Inference. Batch Size=128; Precision=Mixed.

AI Inference - BERT - Large

BERT Large Inference. Batch Size=128; Precision=INT8.

Performance testing with A800 40GB Active and Quadro GV100 GPUs and Intel Xeon Gold 6126 processor.

Supercomputing Performance on Desktop Workstations

Offload demand for data center resources with NVIDIA RTX™-powered AI workstations that deliver the power of a supercomputer to the desktop. Workstation platforms equipped with the latest NVIDIA RTX GPUs and NVIDIA AI Enterprise software provide powerful AI performance to build, train, and deploy the next generation of AI-augmented applications and models.

Specifications

NVIDIA A800 40GB Active

GPU Memory 40GB HBM2
Memory Interface 5,120-bit
Memory Bandwidth 1,555.2 GB/s
CUDA Cores 6,912
Tensor Cores 432
Double-Precision Performance 9.7 TFLOPS
Single-Precision Performance 19.5 TFLOPS
Peak Tensor Performance 1,247 AI TOPS | 623.8 TFLOPS
Multi-Instance GPU Up to 7 MIG instances @ 5GB
NVIDIA NVLink Yes
NVLink Bandwidth 400 GB/s (bidirectional)
Graphics Bus PCIe 4.0 x 16
Max Power Consumption 240W
Thermal Active
Form Factor 4.4” H x 10.5” L, dual slot
Display Capability None*

*Display output requires a companion GPU; see “Get Started With NVIDIA A800 40GB Active” above.

Get Started

Ready to Purchase?

Talk with an NVIDIA design and visualization partner.

Need Help Selecting the Right Product or Partner?

Talk to an NVIDIA product specialist about your professional needs.
