NVIDIA Ampere Architecture
The NVIDIA RTX A2000 12GB is the most powerful dual-slot, low-profile GPU solution, offering high-performance real-time ray tracing, AI-accelerated compute, and professional graphics rendering in a compact design. Building upon the major SM enhancements of the Turing GPU architecture, the NVIDIA Ampere architecture improves ray tracing operations, tensor matrix operations, and concurrent execution of FP32 and INT32 operations.
CUDA Cores
The NVIDIA Ampere architecture-based CUDA cores bring up to 2.7x the single-precision floating-point (FP32) throughput of the previous generation, providing significant performance improvements for graphics workflows such as 3D model development and for compute workloads such as desktop simulation for computer-aided engineering (CAE). The RTX A2000 12GB provides two FP32 primary data paths, doubling the peak FP32 operations.
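To illustrate the kind of work those FP32 data paths execute, here is a minimal CUDA sketch of a single-precision multiply-add (SAXPY) kernel; the kernel name, sizes, and values are illustrative and not tied to any RTX A2000 benchmark.

#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Single-precision a*x + y (SAXPY): a simple FP32 workload that maps directly
// onto the CUDA cores' FP32 data paths, one fused multiply-add per element.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = fmaf(a, x[i], y[i]);
}

int main() {
    const int n = 1 << 20;
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);

    float *dx, *dy;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    cudaMemcpy(dx, hx.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);   // y = 3*x + y
    cudaMemcpy(hy.data(), dy, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("y[0] = %f\n", hy[0]);   // expect 5.0
    cudaFree(dx);
    cudaFree(dy);
    return 0;
}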
Second Generation RT Cores
Incorporating second-generation ray tracing engines, NVIDIA Ampere architecture-based GPUs deliver incredible ray-traced rendering performance. For the first time, NVIDIA is bringing RT Cores to a low-profile form factor GPU. A single RTX A2000 12GB board can render complex professional models with physically accurate shadows, reflections, and refractions to empower users with instant insight. Working in concert with applications leveraging APIs such as NVIDIA OptiX, Microsoft DXR, and Vulkan ray tracing, systems based on the RTX A2000 12GB will power truly interactive design workflows, providing immediate feedback for unprecedented levels of productivity. The RTX A2000 12GB delivers more than 5x the ray tracing performance of the previous generation. This technology also speeds up the rendering of ray-traced motion blur for faster results with greater visual accuracy.
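RT Cores offload BVH traversal and ray-triangle intersection testing from the programmable shader cores. The minimal CUDA sketch below shows the per-ray intersection math (the Möller-Trumbore test) that such hardware replaces; the function names, data layout, and test values are illustrative and are not part of any NVIDIA API.

#include <cstdio>
#include <cuda_runtime.h>

// Möller-Trumbore ray/triangle intersection: the kind of test that RT Cores
// evaluate in dedicated hardware instead of on the shader cores.
struct Ray { float3 o; float3 d; };

__device__ float3 cross3(float3 a, float3 b) {
    return make_float3(a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x);
}
__device__ float dot3(float3 a, float3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
__device__ float3 sub3(float3 a, float3 b) { return make_float3(a.x-b.x, a.y-b.y, a.z-b.z); }

// Returns true and writes the hit distance t if the ray hits triangle (v0, v1, v2).
__device__ bool intersect(const Ray& r, float3 v0, float3 v1, float3 v2, float* t) {
    const float eps = 1e-7f;
    float3 e1 = sub3(v1, v0), e2 = sub3(v2, v0);
    float3 p = cross3(r.d, e2);
    float det = dot3(e1, p);
    if (fabsf(det) < eps) return false;            // ray parallel to triangle
    float inv = 1.0f / det;
    float3 s = sub3(r.o, v0);
    float u = dot3(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return false;
    float3 q = cross3(s, e1);
    float v = dot3(r.d, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return false;
    *t = dot3(e2, q) * inv;
    return *t > eps;                               // hit in front of the origin
}

__global__ void hit_test(float* t_out) {
    Ray r{ make_float3(0.f, 0.f, -1.f), make_float3(0.f, 0.f, 1.f) };
    float t;
    bool hit = intersect(r,
                         make_float3(-1.f, -1.f, 0.f),
                         make_float3( 1.f, -1.f, 0.f),
                         make_float3( 0.f,  1.f, 0.f), &t);
    t_out[0] = hit ? t : -1.0f;
}

int main() {
    float* t;
    cudaMallocManaged(&t, sizeof(float));
    hit_test<<<1, 1>>>(t);
    cudaDeviceSynchronize();
    printf("hit t = %f\n", t[0]);   // expect 1.0: origin is one unit in front of the triangle
    cudaFree(t);
    return 0;
}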
Third Generation Tensor Cores
Purpose-built for the deep learning matrix arithmetic at the heart of neural network training and inferencing, the RTX A2000 12GB includes enhanced Tensor Cores that accelerate more datatypes and add a new Fine-Grained Structured Sparsity feature delivering more than 2x the throughput for tensor matrix operations compared to the previous generation. The new Tensor Cores also accelerate two new precision modes, TF32 and BFloat16. Independent floating-point and integer data paths allow more efficient execution of workloads that use a mix of computation and addressing calculations.
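As a rough illustration of how applications reach the Tensor Cores, the CUDA sketch below uses the public WMMA (warp matrix multiply-accumulate) API from mma.h to multiply a 16x16 half-precision tile pair into an FP32 accumulator; analogous fragments exist for the TF32 and BFloat16 modes in CUDA 11 and later. The matrix contents and tile size are illustrative only.

#include <cstdio>
#include <cuda_runtime.h>
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// One warp multiplies a 16x16 half-precision tile pair and accumulates in FP32
// on the Tensor Cores via the WMMA API (requires sm_70 or later).
__global__ void wmma_16x16(const half* a, const half* b, float* c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);                 // C = 0
    wmma::load_matrix_sync(a_frag, a, 16);             // leading dimension 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);    // C += A * B
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

int main() {
    half* a; half* b; float* c;
    cudaMallocManaged(&a, 16 * 16 * sizeof(half));
    cudaMallocManaged(&b, 16 * 16 * sizeof(half));
    cudaMallocManaged(&c, 16 * 16 * sizeof(float));
    for (int i = 0; i < 16 * 16; ++i) { a[i] = __float2half(1.0f); b[i] = __float2half(1.0f); }

    wmma_16x16<<<1, 32>>>(a, b, c);   // a single warp drives the 16x16x16 tile
    cudaDeviceSynchronize();
    printf("c[0] = %f\n", c[0]);      // expect 16.0: a row of ones dot a column of ones

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}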
Unified Memory
A single, seamless 49-bit virtual address space allows for the transparent migration of data between the full allocation of CPU and GPU memory.
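A minimal CUDA sketch of that model: a buffer allocated with cudaMallocManaged lives in the single virtual address space and is accessed through the same pointer from host and device code, with page migration handled by the driver. The kernel and sizes are illustrative.

#include <cstdio>
#include <cuda_runtime.h>

// The same managed pointer is dereferenced by the CPU and the GPU; the CUDA
// driver migrates the pages on demand within the unified address space.
__global__ void increment(int n, float* data) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main() {
    const int n = 1 << 20;
    float* data = nullptr;
    cudaMallocManaged(&data, n * sizeof(float));      // one allocation, visible to CPU and GPU

    for (int i = 0; i < n; ++i) data[i] = float(i);   // written on the CPU

    increment<<<(n + 255) / 256, 256>>>(n, data);     // updated on the GPU
    cudaDeviceSynchronize();                          // wait before touching it on the CPU again

    printf("data[42] = %f\n", data[42]);              // expect 43.0
    cudaFree(data);
    return 0;
}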
NVIDIA RTX IO
NVIDIA RTX IO, working with Microsoft's new DirectStorage for Windows API, accelerates GPU-based lossless decompression performance by up to 100x and lowers CPU utilization by up to 20x compared to traditional storage APIs. RTX IO moves data from storage to the GPU in a more efficient, compressed form, improving I/O performance.