The CUDA 9.1 Update
NVIDIA has just released CUDA 9.1, an update that adds new algorithms and optimizations to speed up AI and HPC applications on Volta GPUs. With this release you can:
- Develop image augmentation algorithms for deep learning more easily with new functions in NVIDIA Performance Primitives (NPP)
- Run batched neural machine translation and sequence modeling operations on Volta Tensor Cores using new APIs in cuBLAS
- Solve large 2D and 3D FFT problems more efficiently on multi-GPU systems with new heuristics in cuFFT
- Launch kernels up to 12x faster with new core optimizations
CUDA 9.1 also includes compiler optimizations, support for new developer tool versions, and bug fixes. See the CUDA 9.1 release notes for the full list of changes.
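To make "batched" concrete, here is a minimal sketch of a batched matrix multiply in cuBLAS. It deliberately uses `cublasSgemmStridedBatched`, a long-standing FP32 routine, rather than the new Tensor Core APIs mentioned above, which this post does not name. The helper name `batched_matmul` and the packed, column-major layout are assumptions made for illustration, and error checking is left out to keep it short.

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper (not part of cuBLAS): computes C[i] = A[i] * B[i] for
// `batch` independent n x n matrices in a single library call.
// Assumes column-major matrices packed back to back, with n > 0 and batch > 0.
std::vector<float> batched_matmul(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  int n, int batch) {
    const size_t count = static_cast<size_t>(n) * n * batch;
    const size_t bytes = count * sizeof(float);

    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    const float alpha = 1.0f;  // C = alpha * A * B + beta * C
    const float beta = 0.0f;
    // Distance, in elements, between consecutive matrices in each buffer.
    const long long stride = static_cast<long long>(n) * n;

    // One call multiplies every matrix pair in the batch.
    cublasSgemmStridedBatched(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                              n, n, n,
                              &alpha,
                              d_a, n, stride,
                              d_b, n, stride,
                              &beta,
                              d_c, n, stride,
                              batch);

    // A blocking copy on the default stream waits for the GEMM to finish.
    std::vector<float> c(count);
    cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost);

    cublasDestroy(handle);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return c;
}
```

Build it with `nvcc` and link against cuBLAS (`-lcublas`). Issuing one batched call instead of many small ones is what lets the library keep the GPU busy when each individual matrix is small, as is typical in sequence models.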
What is CUDA 9?
CUDA 9 is the most powerful software platform for GPU-accelerated applications. It is built for Volta GPUs and includes faster GPU-accelerated libraries, a new programming model for flexible thread management, and improvements to the compiler and developer tools. With CUDA 9 you can speed up your applications while making them more scalable and robust. CUDA is part of our Deep Learning software stack and is a primary software platform featured in Exxact Deep Learning Solutions. Here is an overview of the features included in CUDA 9:
- Support for the Volta GPU architecture, including the new Tesla V100 accelerator;
- Cooperative Groups, a new programming model for managing groups of communicating threads;
- A new API (preview feature) for programming Tensor Core matrix multiply and accumulate operations on Tesla V100;
- Faster library routines for linear algebra, image processing, FFTs, and more;
- New algorithms in cuSOLVER and nvGraph;
- New NVIDIA Visual Profiler support for Volta V100 as well as improved Unified Memory profiling features;
- Improved compiler performance;
- Support for C++14 in CUDA device code;
- Expanded developer platform and host compiler support including Microsoft Visual Studio 2017, clang 3.9, PGI 17.1, and GCC 6.x.
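Of the features listed, Cooperative Groups is the one that most changes how kernels are written, so a short sketch may help. The example below sums an array by splitting each thread block into 32-thread tiles and reducing within each tile using shuffles. The names `tile_sum`, `sum_kernel`, and `sum_on_gpu` are hypothetical, the block size of 256 is an arbitrary choice, and error checking is omitted; treat it as an illustration of the programming model, not a tuned reduction.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <vector>

namespace cg = cooperative_groups;

// Sums `val` across all threads of a tile using register shuffles.
// The complete sum is only valid on the thread with tile.thread_rank() == 0.
template <typename Tile>
__device__ int tile_sum(Tile tile, int val) {
    for (int offset = tile.size() / 2; offset > 0; offset /= 2) {
        val += tile.shfl_down(val, offset);
    }
    return val;
}

__global__ void sum_kernel(const int* in, int n, int* out) {
    // Name the groups explicitly instead of relying on implicit warp behavior.
    cg::thread_block block = cg::this_thread_block();
    auto tile = cg::tiled_partition<32>(block);

    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    // Out-of-range threads contribute 0 but still take part in the shuffles.
    int val = (idx < n) ? in[idx] : 0;

    int partial = tile_sum(tile, val);
    if (tile.thread_rank() == 0) {
        atomicAdd(out, partial);  // one atomic per 32-thread tile
    }
}

// Hypothetical host wrapper. Assumes host_in is non-empty.
int sum_on_gpu(const std::vector<int>& host_in) {
    const int n = static_cast<int>(host_in.size());
    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, host_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(int));

    const int threads = 256;  // must be a multiple of the tile size (32)
    const int blocks = (n + threads - 1) / threads;
    sum_kernel<<<blocks, threads>>>(d_in, n, d_out);

    int result = 0;
    cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

Because the tile is passed to `tile_sum` as an argument, the function states which threads it expects to cooperate. That makes the helper easier to reuse safely than code that assumes a particular warp layout.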