At the 2016 GPU Technology Conference, NVIDIA announced a massive performance leap for deep learning and HPC applications with the NVIDIA Tesla P100 accelerator. Additionally, NVIDIA unveiled the DGX-1, the world’s first deep learning supercomputer, powered by eight Tesla P100 GPUs. Exxact Corporation is an official launch partner offering the NVIDIA DGX-1 to data scientists and artificial intelligence researchers. Click here to order your NVIDIA DGX-1 today.

The Tesla P100 GPU is the most advanced hyperscale data center accelerator built to date and NVIDIA’s first Tesla card powered by the Pascal architecture, which was first previewed at GTC 2014. It enables a new class of servers capable of delivering the performance of hundreds of CPU server nodes. With artificial intelligence and scientific applications now demanding ultra-efficient, lightning-fast server nodes, the Tesla P100 is exciting news: the accelerator delivers unprecedented performance, scalability, and programming efficiency.


The P100 also supports NVIDIA’s NVLink technology, a proprietary interconnect that lets multiple GPUs, and supporting CPUs, connect directly to each other at far higher bandwidth than a PCI Express 3.0 slot. NVLink also supports up to eight interconnected GPUs, versus the four-GPU limit of PCIe-based SLI.
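NVIDIA’s headline NVLink number works out from simple arithmetic. A quick back-of-envelope comparison, assuming the commonly quoted ~16 GB/s-per-direction theoretical peak of a PCIe 3.0 x16 slot (an assumption, not a figure from this announcement) against the quoted 160 GB/s aggregate NVLink bandwidth:

```python
# Rough bandwidth comparison; the PCIe value is the theoretical peak of a
# 3.0 x16 slot, not a measured number.
pcie3_x16_per_dir_gb_s = 16                        # ~1 GB/s per lane x 16 lanes
pcie3_x16_bidir_gb_s = 2 * pcie3_x16_per_dir_gb_s  # 32 GB/s, both directions
nvlink_bidir_gb_s = 160                            # NVIDIA's quoted P100 figure

print(nvlink_bidir_gb_s / pcie3_x16_bidir_gb_s)    # the "5x" bandwidth claim
```

That ratio is where the 5x figure in NVIDIA’s breakthrough list below comes from.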

NVIDIA cites five architectural breakthroughs that make the Tesla P100 the perfect GPU to power the most computationally demanding applications:

NVIDIA Pascal architecture for exponential performance leap – A Pascal-based Tesla P100 solution delivers over a 12x increase in neural network training performance compared with a previous-generation NVIDIA Maxwell™-based solution.

NVIDIA NVLink for maximum application scalability – The NVIDIA NVLink™ high-speed GPU interconnect scales applications across multiple GPUs, delivering a 5x acceleration in bandwidth compared to today’s best-in-class solution. Up to eight Tesla P100 GPUs can be interconnected with NVLink to maximize application performance in a single node, and IBM has implemented NVLink on its POWER8 CPUs for fast CPU-to-GPU communication.

16nm FinFET for unprecedented energy efficiency – With 15.3 billion transistors built on 16 nanometer FinFET fabrication technology, the Pascal GPU is the world’s largest FinFET chip ever built. It is engineered to deliver the fastest performance and best energy efficiency for workloads with near-infinite computing needs.

CoWoS with HBM2 for big data workloads – The Pascal architecture unifies processor and data into a single package to deliver unprecedented compute efficiency. An innovative approach to memory design, Chip on Wafer on Substrate (CoWoS) with HBM2, provides a 3x boost in memory bandwidth performance, or 720GB/sec, compared to the Maxwell architecture.

New AI algorithms for peak performance – New half-precision instructions deliver more than 21 teraflops of peak performance for deep learning.
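Half precision trades accuracy and range for throughput: an FP16 value is 16 bits, so twice as many fit in a register or memory transaction, which is why the half-precision figure is double the single-precision one. A minimal sketch of what that reduced precision looks like, using Python’s standard struct module (format 'e' is IEEE 754 binary16) rather than any NVIDIA API:

```python
import struct

def to_fp16(x: float) -> float:
    # Round-trip a double through IEEE 754 half precision (struct format 'e').
    return struct.unpack('e', struct.pack('e', x))[0]

# FP16 has a 10-bit mantissa, so small increments near 1.0 vanish...
print(to_fp16(1.0001))   # rounds back to 1.0
# ...and between 2048 and 4096 representable values are 2 apart.
print(to_fp16(2049.7))   # rounds to 2050.0
```

Deep learning training tolerates this coarseness well, which is why NVIDIA targets FP16 at neural networks rather than traditional HPC codes.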

Here are the NVIDIA Tesla P100 technical specifications:

– 5.3 teraflops double-precision performance, 10.6 teraflops single-precision performance and 21.2 teraflops half-precision performance with NVIDIA GPU BOOST™ technology
– 160GB/sec bi-directional interconnect bandwidth with NVIDIA NVLink
– 16GB of CoWoS HBM2 stacked memory
– 720GB/sec memory bandwidth with CoWoS HBM2 stacked memory
– Enhanced programmability with page migration engine and unified memory
– ECC protection for increased reliability
– Server-optimized for highest data center throughput and reliability

The Pascal-based NVIDIA Tesla P100 GPU accelerator is available now, but only in the NVIDIA DGX-1 system. General availability for the Tesla P100 is expected in early 2017. Be sure to check back with us for more information!