Docker
ML Docker Image installed on the Interaction Station ML computers (Ubuntu 16.04):
Installing Docker CE:
- sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
- curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
- sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable"
- sudo apt-get update
- sudo apt-get install docker-ce
- More info: https://unix.stackexchange.com/questions/363048/unable-to-locate-package-docker-ce-on-a-64bit-ubuntu
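The repository line above hardcodes xenial, which matches Ubuntu 16.04. As a minimal sketch, assuming a machine might run a different Ubuntu release, the codename can be filled in from lsb_release -cs. The helper name docker_repo_line is hypothetical and is not part of Docker or apt.

```shell
#!/bin/sh
# Hypothetical helper: print the Docker CE apt source line for a given
# Ubuntu codename (for example "xenial" on Ubuntu 16.04).
docker_repo_line() {
    codename="$1"
    printf 'deb [arch=amd64] https://download.docker.com/linux/ubuntu %s stable\n' "$codename"
}

# Show the line for the release this machine reports; fall back to xenial
# (the release used on this page) when lsb_release is not installed.
if command -v lsb_release >/dev/null 2>&1; then
    docker_repo_line "$(lsb_release -cs)"
else
    docker_repo_line xenial
fi
```

The printed line could then be passed to add-apt-repository, for example: sudo add-apt-repository "$(docker_repo_line "$(lsb_release -cs)")"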
Change Docker root dir using systemd (Don't do this, set volume instead)
- systemctl status docker.service
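- sudo nano /lib/systemd/system/docker.service (use the unit file path shown on the Loaded: line of the status output; /etc/default/docker is not read when Docker is started by systemd)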
- Edit the ExecStart line to look like this: ExecStart=/usr/bin/dockerd -g /media/MachineLearning/docker -H fd://
- sudo systemctl daemon-reload
- sudo systemctl restart docker
- sudo docker info (verify that the "Docker Root Dir" entry shows the new path)
- https://github.com/IronicBadger/til/blob/master/docker/change-docker-root.md
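To check the root dir without reading the whole docker info output, the relevant line can be filtered out. This is a minimal sketch; the helper name root_dir_from_info is hypothetical, and it relies only on the "Docker Root Dir:" label that docker info prints.

```shell
#!/bin/sh
# Hypothetical helper: read `docker info` output on stdin and print the
# value of its "Docker Root Dir:" line.
root_dir_from_info() {
    sed -n 's/^[[:space:]]*Docker Root Dir:[[:space:]]*//p'
}

# Only query the daemon when the docker client is installed.
# (Prefix with sudo if your user is not allowed to talk to the daemon.)
if command -v docker >/dev/null 2>&1; then
    docker info 2>/dev/null | root_dir_from_info
fi
```

After the change described above, the printed path should be /media/MachineLearning/docker.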
Docker - clean up all unused containers, networks, images and volumes
- sudo docker system prune -a -f --volumes
Installing nvidia-docker v1 (deprecated!):
- docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
- sudo apt-get purge -y nvidia-docker
- curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
- distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
- curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
- sudo apt-get update
- sudo apt-get install -y nvidia-docker
- sudo pkill -SIGHUP dockerd
- #Test nvidia-smi with the latest official CUDA image
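- docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
- The --runtime=nvidia option is only registered by nvidia-docker v2; with v1 run the test through the wrapper instead: nvidia-docker run --rm nvidia/cuda:9.0-base nvidia-smi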
- Link:
- https://github.com/NVIDIA/nvidia-docker
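The distribution variable in the steps above is built by sourcing /etc/os-release and joining ID and VERSION_ID, which gives ubuntu16.04 on these machines. A minimal sketch of that step in isolation follows; the helper name distribution_id and its optional file argument are hypothetical and exist only so the logic can be tried on a sample file.

```shell
#!/bin/sh
# Hypothetical helper: print ID and VERSION_ID from an os-release file,
# joined without a separator (the same value the install step computes).
distribution_id() {
    os_release_file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak out.
    ( . "$os_release_file" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Print the value for this machine, if the file exists.
if [ -r /etc/os-release ]; then
    distribution_id
fi
```

The result is the path component used in the nvidia-docker.list URL, so a typo or an unsupported release shows up here before curl is run.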
Installing docker-compose:
Installing nvidia-docker-compose:
- pip install nvidia-docker-compose
- link: https://hackernoon.com/docker-compose-gpu-tensorflow-%EF%B8%8F-a0e2011d36
- If curl reports "Permission denied" when saving the docker-compose binary, see: https://github.com/docker/machine/issues/652
Using Docker with nvidia-docker-compose
- Public Docker registry (the FROM line in a Dockerfile must name one of the images hosted there)
- https://hub.docker.com/
- Dir structure:
- docker-compose.yml
- deepo
- deepo/do_not_finish.sh
- deepo/Dockerfile
- deepo_data (folder that is visible by deepo image)
- docker-compose.yml:
version: '3'
services:
  # machine name
  deepo:
    # container name
    container_name: deepo
    # path to Dockerfile
    build: deepo
    command: sh do_not_finish.sh
    volumes:
      - ./deepo_data:/media/deepo_data
    tty: true
- Dockerfile:
FROM ufoym/deepo
ADD do_not_finish.sh /
- Dockerfiles guide:
- https://rock-it.pl/how-to-write-excellent-dockerfiles/
- do_not_finish.sh:
#!/bin/bash
sh -c 'while :; do sleep 100; done'
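- The endless loop is needed because a container stops as soon as its command finishes, so without it docker-compose would shut the container down right after deploying it
- With the loop keeping the container alive, we can open a shell in it with docker exec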
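The layout and the three files above can also be created in one go. The following is a minimal sketch: the function name scaffold_deepo_project is hypothetical, and the file contents are the ones listed on this page.

```shell
#!/bin/sh
# Hypothetical helper: create the project layout described above inside
# the directory given as the first argument.
scaffold_deepo_project() {
    root="$1"
    mkdir -p "$root/deepo" "$root/deepo_data"

    # Compose file: one service named deepo, built from ./deepo.
    cat > "$root/docker-compose.yml" <<'EOF'
version: '3'
services:
  deepo:
    container_name: deepo
    build: deepo
    command: sh do_not_finish.sh
    volumes:
      - ./deepo_data:/media/deepo_data
    tty: true
EOF

    # Image definition: Deepo plus the keep-alive script.
    cat > "$root/deepo/Dockerfile" <<'EOF'
FROM ufoym/deepo
ADD do_not_finish.sh /
EOF

    # Keep-alive script: loops forever so the container does not exit.
    cat > "$root/deepo/do_not_finish.sh" <<'EOF'
#!/bin/bash
sh -c 'while :; do sleep 100; done'
EOF
}

# Example: build the layout in a scratch directory and list it.
demo_dir="$(mktemp -d)"
scaffold_deepo_project "$demo_dir"
find "$demo_dir" | sort
```

Running the function against an existing project directory overwrites the three files, so it is best used on an empty folder.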
Run it
- Steps 1 and 2: From the folder that contains the docker-compose.yml file:
- sudo nvidia-docker-compose build
- sudo nvidia-docker-compose up
- Step 3: From another terminal:
- sudo nvidia-docker exec -it deepo bash
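Step 3 only works once the container is actually up. As a minimal sketch, the check below looks for the container name in the docker ps output before attempting docker exec. The helper name container_running is hypothetical; docker ps --format '{{.Names}}' is the standard way to list the names of running containers.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a container with exactly this name
# is listed by `docker ps` (running containers only).
container_running() {
    docker ps --format '{{.Names}}' | grep -qxF -- "$1"
}

# Report the state of the deepo container from the compose file.
# (Prefix docker with sudo if your user cannot talk to the daemon.)
if command -v docker >/dev/null 2>&1 && container_running deepo; then
    echo "deepo is up"
else
    echo "deepo is not running"
fi
```

If the check reports that deepo is not running, the output of the nvidia-docker-compose up terminal is the first place to look.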
Troubleshooting problems
- Check nvidia-docker version (needs to be version 1)
- nvidia-docker version
- More info:
- https://github.com/eywalker/nvidia-docker-compose/issues/26
- Permission denied: u'./docker-compose.yml'
- https://github.com/docker/docker-snap/issues/26
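The version check above can be scripted. This is a sketch under one stated assumption: that nvidia-docker version prints a line of the form "NVIDIA Docker: X.Y.Z". If your installation prints something else, the pattern needs adjusting. The helper name nvidia_docker_major is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `nvidia-docker version` output on stdin and
# print the major version number.
# ASSUMPTION: the output contains a line like "NVIDIA Docker: X.Y.Z".
nvidia_docker_major() {
    sed -n 's/^NVIDIA Docker:[[:space:]]*\([0-9][0-9]*\)\..*/\1/p'
}

# Only run the check when the nvidia-docker wrapper is installed.
if command -v nvidia-docker >/dev/null 2>&1; then
    major="$(nvidia-docker version 2>/dev/null | nvidia_docker_major)"
    if [ "$major" = "1" ]; then
        echo "nvidia-docker v1 found"
    else
        echo "nvidia-docker major version: ${major:-unknown} (v1 expected)"
    fi
fi
```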
Deepo
It includes:
- cudnn
- theano
- tensorflow
- sonnet
- pytorch
- keras
- lasagne
- mxnet
- cntk
- chainer
- caffe
- caffe2
- torch
Installing Deepo:
Run Deepo image with Docker:
- sudo nvidia-docker run -it ufoym/deepo:gpu bash
Run Deepo image with Docker (with Python 2.7):
- sudo nvidia-docker run -it ufoym/deepo:py27 bash