As GPUs have become more common in high performance computing (HPC), fast GPU-to-GPU (peer-to-peer) communication has become more important, and NVLink provides a significant performance benefit there. NVLink is an energy-efficient, high-bandwidth path between GPUs that speeds up applications and makes it possible to build high-density, scalable servers for accelerated computing. It offers 5 to 12 times the bandwidth of PCIe and is available with NVIDIA Pascal GPUs in the SXM2 form factor.

How it works

To see how NVLink technology works, let’s take a look at the Exxact Tensor TXR410-3000R, which features the NVLink high-speed interconnect and 8x Tesla P100 Pascal GPUs. NVLink connects multiple GPUs directly to each other (up to eight Tesla P100s in this case). Each GPU has four NVLink links that together provide 80 GB/s of bandwidth in each direction. Below is an example of two groups of four P100s directly connected to each other.


This provides up to 160 GB/s of bidirectional GPU bandwidth to peers, load/store access to peer memory, full atomics to peer GPUs, and high-speed copy engines for bulk data copies. The connection to the CPU is via PCIe, but a GPU-to-CPU NVLink connection is also available with CPUs that have NVLink ports.
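The bandwidth figures quoted above can be tied together with a little arithmetic. The sketch below is only an illustration, and it assumes a per-link rate of about 20 GB/s in each direction for first-generation NVLink on the Tesla P100; that per-link number is not stated in this article, and the function and constant names are made up for the example.

```cpp
// Assumed figures (not from the article): first-generation NVLink on a
// Tesla P100 carries roughly 20 GB/s per link in each direction.
constexpr int    kLinksPerGpu             = 4;     // four links per GPU, as described above
constexpr double kGBpsPerLinkPerDirection = 20.0;  // assumption, see lead-in

// Aggregate bandwidth in one direction across all links of one GPU.
constexpr double per_direction_gbps(int links, double per_link_gbps) {
    return links * per_link_gbps;
}

// Aggregate bandwidth counting traffic in both directions at once.
constexpr double bidirectional_gbps(int links, double per_link_gbps) {
    return 2.0 * per_direction_gbps(links, per_link_gbps);
}

// Compile-time checks that the assumed per-link rate reproduces the
// 80 GB/s and 160 GB/s figures quoted in the text.
static_assert(per_direction_gbps(kLinksPerGpu, kGBpsPerLinkPerDirection) == 80.0,
              "four links at 20 GB/s should total 80 GB/s per direction");
static_assert(bidirectional_gbps(kLinksPerGpu, kGBpsPerLinkPerDirection) == 160.0,
              "both directions together should total 160 GB/s");
```

Under that assumption, the 80 GB/s figure is the one-way total across a GPU's four links, and the 160 GB/s figure is the same total counted in both directions.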


NVLink improves application performance by speeding up data movement in multi-GPU configurations. Applications that rely on exchanging data across GPUs can run much faster over NVLink than over the PCIe bus. Below is a list of some applications that can benefit from NVLink:

• Multi-GPU exchange and sort
• Fast Fourier Transform (FFT)
• AMBER – Molecular Dynamics (PMEMD)
• ANSYS Fluent – Computational Fluid Dynamics
• Lattice Quantum Chromodynamics (LQCD)

