Installing and Using DOCKER and NV-DOCKER on CentOS 7

DOCKER-ENGINE is a containerization technology that allows you to create, develop, and run applications. In this article we focus primarily on the basic installation steps for DOCKER and NV-DOCKER (a wrapper that NVIDIA provides), and on using the two together as a stable platform for pulling docker images, which are used to create containers. A container is an ‘instance’ of an environment, created from a docker image. Containers can be run once and discarded, or live on as persistent daemon processes; examples of both appear below.

 

Installing and getting DOCKER and NV-DOCKER running on CentOS 7 is a straightforward process:
# Assumes CentOS 7
# Assumes NVIDIA Driver is installed as per requirements ( >= 340.29 )
# Install DOCKER
sudo curl -fsSL https://get.docker.com/ | sh
# Start DOCKER
sudo systemctl start docker
# Add dockeruser and add it to the docker group
sudo adduser dockeruser
sudo usermod -aG docker dockeruser
# Install NV-DOCKER
# GET NVIDIA-DOCKER
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm
# INSTALL
sudo rpm -i /tmp/nvidia-docker*.rpm
# Start NV-DOCKER Service
sudo systemctl start nvidia-docker
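Optionally, you can confirm that the docker client is installed and that dockeruser picked up the docker group (the group change only takes effect for new login sessions). A quick check might look like:

# Check the installed client version
docker --version
# Confirm dockeruser is a member of the docker group
groups dockeruser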

After the steps above, the Docker and NVIDIA-DOCKER services should both be running.

This can be checked via:

[username@host ~]# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-03-23 20:59:01 PDT; 16h ago


..... truncated ....


[username@host ~]# systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
   Loaded: loaded (/usr/lib/systemd/system/nvidia-docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-03-23 20:58:59 PDT; 17h ago

..... truncated ....
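Note that both units show up as disabled in the Loaded line above, meaning they will not start automatically at boot. If you want Docker and NVIDIA-DOCKER to come back up after a reboot, you can enable them as well:

# Enable both services at boot
sudo systemctl enable docker
sudo systemctl enable nvidia-docker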

 

Using Docker/NVIDIA-DOCKER

 

Pull and Run your First Container

For a quick test of using NVIDIA GPUs in a container, you can run the following example. It can be run either as sudo/root or as the dockeruser from the install instructions above (su - dockeruser):

# Instantiate a container via the nvidia-docker command.
# Note: nvidia-docker must be used for any docker command involving "run" where you
# want GPU access. It is a wrapper that sets up the container environment for the
# GPUs (GPGPU support, etc.) and then hands off to docker.
nvidia-docker run --rm nvidia/cuda nvidia-smi

 

Command Explanation:
  • nvidia-docker – the NVIDIA shim/wrapper that helps set up GPUs with DOCKER
  • run – tells the nvidia-docker wrapper that you’re going to start (instantiate) a container
    • Note that for any command that does not include ‘run’, you can simply use docker; if you use nvidia-docker instead, the command is passed straight through to docker (e.g. docker images displays the docker images on your system, and nvidia-docker images would show the same info)
  • --rm – tells DOCKER that after the command runs, the container should be stopped and removed
    • This is a very useful capability: an entire environment is created just so nvidia-smi can run, and then the container is destroyed. It can be repeated as often as you like, and it is simple and fast.
  • nvidia/cuda – the name of an image
    • The first time you run this command, DOCKER will find an image with that name and download it from the hub.docker.com repository. This only happens once. You could also run docker pull nvidia/cuda beforehand to be explicit and separate the steps (see the example after the output below); the one-liner works on its own, though.
  • nvidia-smi – the command to be run in the container

You should get output that looks like the below:

Note that the Pull complete portions (the lines above the nvidia-smi output) are a one-time occurrence: the image is not yet on your system, so it is fetched before being launched as a container instance.

[user@host ~]# nvidia-docker run --rm nvidia/cuda nvidia-smi
Using default tag: latest
latest: Pulling from nvidia/cuda
d54efb8db41d: Pull complete
f8b845f45a87: Pull complete
e8db7bf7c39f: Pull complete
9654c40e9079: Pull complete
6d9ef359eaaa: Pull complete
cdfa70f89c10: Pull complete
3208f69d3a8f: Downloading 151.3 MB/421.5 MB
eac0f0483475: Download complete
4580f9c5bac3: Verifying Checksum
6ee6617c19de: Downloading   109 MB/456.1 MB
Fri Mar 24 20:47:52 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:03:00.0      On |                  N/A |
| 27%   34C    P8     7W / 180W |   7725MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
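As mentioned in the command explanation above, the image download only happens the first time, and any command that does not involve run passes straight through the wrapper to docker. A sketch of doing the pull as a separate, explicit step and then listing the cached images (either command form works):

# Pull the image explicitly instead of letting run fetch it
docker pull nvidia/cuda
# List the images cached locally; docker and nvidia-docker give the same result here
docker images
nvidia-docker images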

 

Running a persistent container / NVIDIA DIGITS

The example below demonstrates pulling a DIGITS image and running it in daemon/persistent mode.

It should be noted that in order to use DIGITS you will need to provide it data via the -v command line switch when launching the docker container. This switch maps a directory on the local machine to a mount point within the container. For example, -v /mnt/dataset:/data/dataset would map /mnt/dataset on the host machine to /data/dataset in the container, and when interacting with DIGITS you would see this data when creating datasets, etc. from the Web UI. (A command combining this switch with the run options below is sketched after the second example.)

 

Running nvidia-docker
[user@host~]# NV_GPU=0,1 nvidia-docker run --name digits -d -p 5000:5000 nvidia/digits
6b12a4107569214a3177304ef2c9db0f333e266d0d766d2c8c02e5bbddd3d444 # This is the Instance ID launched from the nvidia-docker run command

 

Command Explanation:
  • NV_GPU=0,1
    • This is how GPU resources are assigned to a container, which is critical for leveraging DOCKER on a multi-GPU system. It passes GPU IDs 0 and 1 from the host system into the container as resources. Note that if you passed GPU IDs 2,3 instead, the container would still see them as IDs 0,1 inside the container, while they keep PCI IDs 2,3 on the host system (see the sketch after this list).
  • nvidia-docker – the NVIDIA shim/wrapper that helps set up GPUs with DOCKER (as in the first example)
  • run – tells the nvidia-docker wrapper that you’re going to start (instantiate) a container; as noted above, commands that do not include ‘run’ can go through either docker or nvidia-docker
  • --name digits
    • This names your container instance. You need a unique name for each instance created this way; the name gives you another way to reference the instance besides the default instance ID hash.
  • -d
    • Instructs DOCKER that this will be a daemonized/persistent container
  • -p 5000:5000
    • This maps a port: host port 5000 is mapped to container port 5000, the DIGITS web server port.
    • If you ran multiple containers/instances of DIGITS, you could use -p 5001:5000 for the next container; you would then connect to it at IP_ADDRESS:5001 while still reaching the other DIGITS container at IP_ADDRESS:5000.
  • nvidia/digits
    • The image we’re launching
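To see the GPU renumbering described in the NV_GPU bullet above, a quick throwaway sketch: pass a single host GPU into a temporary container and run nvidia-smi; inside the container it is reported as GPU 0.

# Expose only host GPU 1 to a short-lived container; nvidia-smi inside reports it as GPU 0
NV_GPU=1 nvidia-docker run --rm nvidia/cuda nvidia-smi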

After running this command, you could connect to DIGITS at the URL of the host system on port 5000, and it would have access to GPUs 0 and 1 as resources within the container and within DIGITS in that container. If, for example, this were a 4-GPU machine, you could run the following to create another container based on the same image, exposing a different port so that the two containers don’t conflict with each other, and specifying different GPUs so that they don’t try to use the same GPGPU resources.

[user@host~]# NV_GPU=2,3 nvidia-docker run --name digits1 -d -p 5001:5000 nvidia/digits
95e42817050c3e6de88f61473692a71ac0ab0948fe873c06155b95b62dad5554 # Instance ID!

Now you would have another DIGITS instance on port 5001, accessible from a web browser, and this DIGITS installation would have access to GPUs 2 and 3 from the host system.
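For reference, here is a sketch of how the -v switch described earlier combines with the rest of the options for that second instance (you would use this in place of the command above, since container names must be unique, and it assumes /mnt/dataset exists on the host):

# Second DIGITS instance on GPUs 2,3 and port 5001, with a host dataset mapped to /data/dataset
NV_GPU=2,3 nvidia-docker run --name digits1 -d -p 5001:5000 -v /mnt/dataset:/data/dataset nvidia/digits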

 

Check Running Containers

You can check your running containers/instances by running either nvidia-docker ps or docker ps; see below for an example:

Note the PORTS column, which is very helpful once you have containers up and running, since it shows how the ports are mapped.

CONTAINER ID        IMAGE                              COMMAND              CREATED             STATUS              PORTS                              NAMES
95e42817050c        nvidia/digits                      "python -m digits"   25 seconds ago      Up 24 seconds       0.0.0.0:5001->5000/tcp             digits1
6b12a4107569        nvidia/digits                      "python -m digits"   16 hours ago        Up 16 hours         0.0.0.0:5000->5000/tcp             digits
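When you are finished with an instance, its name (or container ID) can be used to stop and remove it; stopped containers continue to show up under docker ps -a until they are removed. For example:

# Stop and remove the second DIGITS instance by name
docker stop digits1
docker rm digits1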