install tensorflow-gpu in docker on RHEL7

0. make sure that you have installed cuda

Check environment

  1. Verify that you have a CUDA-capable GPU. Run the command below; it should list your NVIDIA graphics hardware.
    lspci | grep -i nvidia
  2. Verify that you have a supported version of Linux. Check the architecture and distribution release so that you install the matching version of CUDA.
    uname -m && cat /etc/*release
  3. Verify that gcc is installed. If it is not, install gcc before installing CUDA.
    gcc --version
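
The three checks above can also be scripted. The following is a minimal Python sketch, not part of the original instructions: the helper names `nvidia_devices` and `run` are hypothetical, and it assumes `lspci` prints one device per line.

```python
import shutil
import subprocess


def nvidia_devices(lspci_output):
    """Return the lspci lines that mention NVIDIA, like `grep -i nvidia`."""
    return [line for line in lspci_output.splitlines() if "nvidia" in line.lower()]


def run(cmd):
    """Run a command and return its stdout, or None if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True).stdout


if __name__ == "__main__":
    # Check 1: is there an NVIDIA device on the PCI bus?
    gpus = nvidia_devices(run(["lspci"]) or "")
    print("NVIDIA devices:", gpus if gpus else "none found")
    # Check 2: machine architecture (the release files still need a manual look).
    print("Architecture:", (run(["uname", "-m"]) or "unknown").strip())
    # Check 3: is gcc available?
    gcc = run(["gcc", "--version"])
    print("gcc:", gcc.splitlines()[0] if gcc else "not installed")
```

If the first line reports no devices, there is little point in continuing with the CUDA installation on that machine.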

Install steps (by runfile)

  1. (optional) Install the kernel headers and development packages for the currently running kernel. If these packages are already installed, you can skip this step.
    sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
  2. Disable the Nouveau drivers
    a. Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents
    blacklist nouveau
    options nouveau modeset=0
    b. Regenerate the kernel initramfs
    sudo dracut --force
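  3. (optional) Reboot. If the system boots into the graphical interface, switch to the text console (runlevel 3) with sudo init 3. This step is required only if you install the graphics card driver, which cannot be installed while the X server is running.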
  4. Run the installer (runfile) with sudo sh cuda_<version>_linux.run. If you choose to install the graphics card driver, you must complete step 3 first.
  5. Create an xorg.conf file to use the NVIDIA GPU for display:
    sudo nvidia-xconfig
  6. Reboot the system to load the graphical interface
  7. Set up the development environment by modifying the PATH and LD_LIBRARY_PATH variables
    export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
  8. Build and run the nbody sample to check that CUDA works.
    cd /usr/local/cuda/samples/5_Simulations/nbody
    sudo make
    ./nbody
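
The `${PATH:+:${PATH}}` form used in step 7 appends a colon and the old value only when the variable is already set and non-empty. This avoids a trailing colon, which would add an empty entry that is interpreted as the current directory. Below is a minimal Python sketch of the same logic; the helper name `prepend_path` is hypothetical and is only meant to illustrate the expansion.

```python
import os


def prepend_path(new_entry, current):
    """Mimic the shell expansion NEW${VAR:+:${VAR}}.

    The separator and the old value are appended only if `current`
    is set and non-empty; otherwise the new entry is returned alone.
    """
    return new_entry + (":" + current if current else "")


if __name__ == "__main__":
    # Show what the two exports from step 7 would evaluate to in this environment.
    print(prepend_path("/usr/local/cuda/bin", os.environ.get("PATH")))
    print(prepend_path("/usr/local/cuda/lib64", os.environ.get("LD_LIBRARY_PATH")))
```

Note that the exports only affect the current shell session. To keep them across logins, one option is to add the two export lines to a shell startup file such as ~/.bashrc.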

1. install nvidia-docker

Refer to https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)
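
After the installation, it may help to confirm that Docker actually knows about the nvidia runtime before moving on. The sketch below is an assumption-laden example rather than part of the referenced guide: it relies on `docker info --format '{{json .Runtimes}}'` printing a JSON object keyed by runtime name, and the helper name `has_nvidia_runtime` is hypothetical.

```python
import json
import shutil
import subprocess


def has_nvidia_runtime(runtimes_json):
    """Return True if the JSON object of Docker runtimes has an 'nvidia' key."""
    runtimes = json.loads(runtimes_json) or {}  # tolerate a JSON null
    return "nvidia" in runtimes


if __name__ == "__main__":
    if shutil.which("docker") is None:
        print("docker is not installed")
    else:
        result = subprocess.run(
            ["docker", "info", "--format", "{{json .Runtimes}}"],
            capture_output=True, text=True,
        )
        if result.returncode != 0 or not result.stdout.strip():
            print("could not query the Docker daemon:", result.stderr.strip())
        else:
            print("nvidia runtime registered:", has_nvidia_runtime(result.stdout))
```

If the runtime is not listed, restarting the Docker daemon after installing nvidia-docker is usually the first thing to try.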

2. pull image from tensorflow/tensorflow

$ docker pull tensorflow/tensorflow:latest-gpu-py3

3. run docker container

$ docker run -it --runtime=nvidia --rm tensorflow/tensorflow:latest-gpu-py3
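
The flags in this command are: -it for an interactive terminal, --rm to delete the container on exit, and --runtime=nvidia to expose the GPUs. The following Python sketch assembles the same command and optionally restricts the visible GPUs through the NVIDIA_VISIBLE_DEVICES environment variable. The helper name `docker_run_command` and its parameters are hypothetical, chosen only for illustration.

```python
import shlex


def docker_run_command(image, visible_devices=None, command=None):
    """Build the argv for an interactive, auto-removed container on the nvidia runtime."""
    argv = ["docker", "run", "-it", "--rm", "--runtime=nvidia"]
    if visible_devices is not None:
        # For example "0" or "0,1"; "all" exposes every GPU.
        argv += ["-e", "NVIDIA_VISIBLE_DEVICES=" + visible_devices]
    argv.append(image)
    if command:
        # Optional command to run instead of the image's default shell.
        argv += list(command)
    return argv


if __name__ == "__main__":
    argv = docker_run_command("tensorflow/tensorflow:latest-gpu-py3", visible_devices="0")
    # Print a copy-pasteable shell command instead of executing it.
    print(" ".join(shlex.quote(a) for a in argv))
```

Printing the command rather than running it keeps the sketch safe to try on a machine without Docker.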

4. check whether it works (run inside the container)

$ python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
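
The one-liner above uses TensorFlow 1.x names (`tf.enable_eager_execution`, `tf.random_normal`), which are no longer available at the top level in TensorFlow 2.x, and it succeeds even when the computation runs on the CPU. As a complementary check, the sketch below lists the GPUs that TensorFlow can see. It assumes that `tensorflow.python.client.device_lib.list_local_devices()` is importable in the image; this is an internal module path rather than a documented public API, so treat it as an assumption. The helper name `gpu_names` is hypothetical.

```python
def gpu_names(devices):
    """Return the names of the devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        # Internal module path; assumed to exist in the TensorFlow build in use.
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        names = gpu_names(device_lib.list_local_devices())
        print("GPUs visible to TensorFlow:", names if names else "none")
```

An empty result inside the container usually points to the runtime setup (steps 1 and 3) rather than to TensorFlow itself.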

5. reference

[1] https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
[2] https://github.com/nvidia/nvidia-docker/wiki
[3] https://www.tensorflow.org/install/docker

Appendix: docker-compose.yml

version: '2.3'
services:
  tensorflow-gpu:
    image: tensorflow/tensorflow:latest-gpu-py3
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all


https://r-future.github.io/post/intall-tensorflow-gpu-in-docker/
Author: Future
Posted on: July 15, 2019