[Docker] Install for tensorflow-gpu

Docker for TensorFlow GPU install
Date: 2023-09-27 14:23:24

Ref: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

ubuntu@ip-172-31-39-8:/mnt$ nvidia-smi
Thu Nov  5 08:57:35 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   58C    P0    25W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
ubuntu@ip-172-31-39-8:/mnt$ docker -v
Docker version 19.03.11, build dd360c7
ubuntu@ip-172-31-39-8:/mnt$ curl https://get.docker.com | sh \
>   && sudo systemctl start docker \
>   && sudo systemctl enable docker
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
100 13857  100 13857    0     0   2504      0  0:00:05  0:00:05 --:--:--  3444
# Executing docker install script, commit: 26ff363bcf3b3f5a00498ac43694bf1c7d9ce16c
Warning: the "docker" command appears to already exist on this system.

If you already have Docker installed, this script can cause trouble, which is
why we're displaying this warning and provide the opportunity to cancel the
installation.

If you installed the current Docker package using this script and are using it
again to update Docker, you can safely ignore this message.

You may press Ctrl+C now to abort this script.
+ sleep 20




+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null

+ sudo -E sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | apt-key add -qq - >/dev/null
Warning: apt-key output should not be parsed (stdout is not a terminal)
+ sudo -E sh -c echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable" > /etc/apt/sources.list.d/docker.list
+ sudo -E sh -c apt-get update -qq >/dev/null
+ [ -n  ]
+ sudo -E sh -c apt-get install -y -qq --no-install-recommends docker-ce >/dev/null
+ sudo -E sh -c docker version
Client: Docker Engine - Community
 Version:           19.03.13
 API version:       1.40
 Go version:        go1.13.15
 Git commit:        4484c46d9d
 Built:             Wed Sep 16 17:02:36 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.13
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       4484c46d9d
  Built:            Wed Sep 16 17:01:06 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.7
  GitCommit:        8fba4e9a7d01810a393d5d25a3621dc101981175
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
If you would like to use Docker as a non-root user, you should now consider
adding your user to the "docker" group with something like:

  sudo usermod -aG docker ubuntu

Remember that you will have to log out and back in for this to take effect!

WARNING: Adding a user to the "docker" group will grant the ability to run
         containers which can be used to obtain root privileges on the
         docker host.
         Refer to https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface
         for more information.
Synchronizing state of docker.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable docker
ubuntu@ip-172-31-39-8:/mnt$ docker -v
Docker version 19.03.11, build dd360c7
ubuntu@ip-172-31-39-8:/mnt$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
>    && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
>    && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
OK
deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /
ubuntu@ip-172-31-39-8:/mnt$ curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
ubuntu@ip-172-31-39-8:/mnt$ sudo apt-get update
Hit:1 http://ap-southeast-2.ec2.archive.ubuntu.com/ubuntu bionic InRelease
Hit:2 http://ap-southeast-2.ec2.archive.ubuntu.com/ubuntu bionic-updates InRelease                                                                                                                         
Get:3 http://ap-southeast-2.ec2.archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]                                                                                                             
Hit:4 https://download.docker.com/linux/ubuntu bionic InRelease                                                                                                                                             
Get:5 https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/amd64  InRelease [1158 B]                                                                                                       
Get:6 https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/amd64  InRelease [1149 B]                                                         
Get:7 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64  InRelease [1139 B]
Get:8 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64  InRelease [1136 B]     
Get:9 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64  InRelease [1129 B]            
Hit:10 http://security.ubuntu.com/ubuntu bionic-security InRelease                                       
Hit:11 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease
Get:12 https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/amd64  Packages [3392 B]
Get:13 https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/amd64  Packages [804 B]
Get:14 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64  Packages [9128 B]
Get:15 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64  Packages [6148 B]
Get:16 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64  Packages [4332 B]
Fetched 104 kB in 2s (62.5 kB/s)     
Reading package lists... Done
ubuntu@ip-172-31-39-8:/mnt$ sudo apt-get install -y nvidia-docker2
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-runtime nvidia-container-toolkit
The following NEW packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-runtime nvidia-container-toolkit nvidia-docker2
0 upgraded, 5 newly installed, 0 to remove and 16 not upgraded.
Need to get 1471 kB of archives.
After this operation, 4683 kB of additional disk space will be used.
Get:1 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64  libnvidia-container1 1.3.0-1 [67.0 kB]
Get:2 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64  libnvidia-container-tools 1.3.0-1 [20.4 kB]
Get:3 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64  nvidia-container-toolkit 1.3.0-1 [763 kB]
Get:4 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64  nvidia-container-runtime 3.4.0-1 [615 kB]
Get:5 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64  nvidia-docker2 2.5.0-1 [5912 B]
Fetched 1471 kB in 0s (26.4 MB/s)        
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 108331 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.3.0-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.3.0-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.3.0-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.3.0-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.3.0-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.3.0-1) ...
Selecting previously unselected package nvidia-container-runtime.
Preparing to unpack .../nvidia-container-runtime_3.4.0-1_amd64.deb ...
Unpacking nvidia-container-runtime (3.4.0-1) ...
Selecting previously unselected package nvidia-docker2.
Preparing to unpack .../nvidia-docker2_2.5.0-1_all.deb ...
Unpacking nvidia-docker2 (2.5.0-1) ...
Setting up libnvidia-container1:amd64 (1.3.0-1) ...
Setting up libnvidia-container-tools (1.3.0-1) ...
Setting up nvidia-container-toolkit (1.3.0-1) ...
Setting up nvidia-container-runtime (3.4.0-1) ...
Setting up nvidia-docker2 (2.5.0-1) ...
Processing triggers for libc-bin (2.27-3ubuntu1.2) ...
ubuntu@ip-172-31-39-8:/mnt$ sudo systemctl restart docker
ubuntu@ip-172-31-39-8:/mnt$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Unable to find image 'nvidia/cuda:11.0-base' locally
11.0-base: Pulling from nvidia/cuda
54ee1f796a1e: Pull complete 
f7bfea53ad12: Pull complete 
46d371e02073: Pull complete 
b66c17bbf772: Pull complete 
3642f1a6dfb3: Pull complete 
e5ce55b8b4b9: Pull complete 
155bc0332b0a: Pull complete 
Digest: sha256:774ca3d612de15213102c2dbbba55df44dc5cf9870ca2be6c6e9c627fa63d67a
Status: Downloaded newer image for nvidia/cuda:11.0-base
Thu Nov  5 09:04:59 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   44C    P0    21W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
ubuntu@ip-172-31-39-8:/mnt$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
nvidia/cuda         11.0-base           2ec708416bb8        2 months ago        122MB
The GPU setup is verified above by running a base CUDA container (nvidia/cuda:11.0-base) with nvidia-smi.
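
The repository setup step in the transcript builds the apt source URL from a distribution string read out of /etc/os-release. Below is a minimal sketch of that one step. The function names distribution_id and repo_list_url are illustrative and not part of any NVIDIA tooling. The URL pattern is simply the one used in the transcript (November 2020) and may have changed since.

```shell
# Derive the distribution string the same way the transcript does
# (ID followed by VERSION_ID), but from an os-release file given as
# an argument. Defaults to /etc/os-release. Runs in a subshell so the
# sourced variables do not leak into the caller.
distribution_id() {
    ( . "${1:-/etc/os-release}" && echo "${ID}${VERSION_ID}" )
}

# Build the URL of the nvidia-docker apt source list for that distribution.
# The URL layout is assumed from the transcript, not checked against current docs.
repo_list_url() {
    echo "https://nvidia.github.io/nvidia-docker/$(distribution_id "$1")/nvidia-docker.list"
}
```

On the Ubuntu 18.04 host from the transcript, repo_list_url would print the ubuntu18.04 list URL, which is then fetched with curl and written to /etc/apt/sources.list.d/ with sudo tee.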

First make sure the NVIDIA driver is installed on the host PC, then install as follows.
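
One way to script the driver check is to read the driver version from the nvidia-smi header. This is only a sketch: the function names driver_version and host_driver_ready are hypothetical, and the header format is assumed to match the nvidia-smi output shown in the transcript.

```shell
# Extract the driver version from nvidia-smi header text read on stdin.
# Prints nothing if no "Driver Version: X" field is present.
driver_version() {
    sed -n 's/.*Driver Version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeeds (exit 0) and reports the version when one can be read from
# stdin; fails (exit 1) otherwise, e.g. when nvidia-smi is missing.
host_driver_ready() {
    version=$(driver_version)
    [ -n "$version" ] && echo "driver $version detected"
}
```

A possible use on the host: nvidia-smi 2>&1 | host_driver_ready || echo "install the NVIDIA driver first".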

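The default package versions may cause assorted small problems when the code runs, which is presumably why the Dockerfile below pins specific versions of tensorflow-gpu and keras.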
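
To confirm that a pinned version actually ended up in the image, the output of pip freeze can be checked against the expected pin. The helper below is a hypothetical sketch (check_pin is not a pip feature). It does a simple prefix match, so 1.14 accepts 1.14.0 but would also accept 1.140.

```shell
# Check that pip-freeze-style text on stdin lists package $1 at a
# version starting with $2. Prints a short report; returns 1 on mismatch.
check_pin() {
    pkg=$1
    want=$2
    # "name==version" split on "=" gives the version in field 3.
    got=$(grep -i "^${pkg}==" | head -n 1 | cut -d= -f3)
    case "$got" in
        "$want"*) echo "ok: $pkg $got" ;;
        *) echo "mismatch: $pkg wanted $want, got ${got:-nothing}"; return 1 ;;
    esac
}
```

Inside the container this could be run as: pip freeze | check_pin tensorflow-gpu 1.14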

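# Base image with GPU support. Note: the latest-gpu tag tracks the newest
# TensorFlow release, so the pinned tensorflow-gpu==1.14 below may not match
# its Python/CUDA versions; a version-specific base tag may be a safer choice.
FROM tensorflow/tensorflow:latest-gpu

# MAINTAINER is deprecated; LABEL carries the same information.
LABEL maintainer="sbll@gmail.com"

# Pin the framework versions the code was written against.
RUN pip install tensorflow-gpu==1.14 \
    && pip install keras==2.3.1 \
    && pip install numpy \
    && pip install scikit-image \
    && pip install efficientnet \
    && pip install awscli --upgrade --user \
    && pip install boto3

# Copy the application code into the image and run from that directory.
COPY ./package /package

WORKDIR /package

# Run both scripts in order; world.py runs even if hello.py fails.
CMD python hello.py; python world.py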
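
The post stops before showing how the image is built and started. A sketch of those two steps follows, assuming the Dockerfile sits in the current directory next to ./package. The tag tf-gpu-demo and the helper names are made up for illustration. The --gpus all flag is the one used with nvidia/cuda:11.0-base in the transcript. Commands are only printed unless RUN=1 is set, so the sketch can be inspected safely first.

```shell
IMAGE_TAG="tf-gpu-demo"   # hypothetical image tag, not from the original post

# Print the command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Build the image from the Dockerfile in the current directory.
build_image() {
    run docker build -t "$IMAGE_TAG" .
}

# Start a throwaway container with all host GPUs exposed; the image's
# CMD (hello.py, then world.py) runs inside it.
run_image() {
    run docker run --rm --gpus all "$IMAGE_TAG"
}
```

Calling build_image and then run_image prints the two docker commands. Exporting RUN=1 beforehand executes them, possibly with sudo depending on how Docker is set up on the host.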

 
