This work is licensed under a Creative Commons Attribution 4.0 International License.
I am sharing my experiences setting up my Deep Learning Workstation. The main reason for this documentation is to be able to redo the whole installation and configuration if needed. Furthermore, I hope it helps others who are getting started with this topic. Feel free to contribute.
The advantage of using Docker for a Deep Learning Workstation is that you only need to install the following on your host system:
- Linux OS (Ubuntu 20.04)
- Docker CE
- NVIDIA Driver
Everything else is installed inside Docker containers. With containers, you can use different CUDA versions at the same time in different containers. That is why I believe containers are the best way to develop Deep Learning models.
(source: https://github.com/NVIDIA/nvidia-docker)
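Once these host prerequisites are in place, a quick way to verify that containers can see the GPU is to run `nvidia-smi` inside a CUDA base image. A minimal sketch, assuming an image tag that matches your driver version:

```bash
# Run nvidia-smi inside a throwaway CUDA container.
# --gpus all exposes all host GPUs to the container (requires the NVIDIA Container Toolkit).
# The image tag below is an assumption; pick one compatible with your installed driver.
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu20.04 nvidia-smi
```

If this prints the same GPU table as `nvidia-smi` on the host, the driver and the Docker integration are working.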
The InstallationInstructions.md file provides information on:
- Installing Ubuntu Server
- Enabling remote access via SSH
- Blacklisting the Nouveau driver and installing the NVIDIA driver afterwards (see the sketch after this list)
- Installing Docker CE
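As a rough illustration of the Nouveau blacklisting step (the full procedure, including the driver installation itself, is in InstallationInstructions.md; the file name below is the conventional choice, not a requirement):

```bash
# Tell the kernel not to load the open-source Nouveau driver,
# which conflicts with the proprietary NVIDIA driver.
printf "blacklist nouveau\noptions nouveau modeset=0\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf

# Rebuild the initramfs and reboot so the blacklist takes effect,
# then install the NVIDIA driver.
sudo update-initramfs -u
sudo reboot
```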
After having installed Docker, check out the following sections below:
- Docker for beginners
- docker compose examples with GPU support; here you will also find examples for PyTorch and TensorFlow
The DockerForBeginners.md file provides information on some of my commonly used Docker commands (illustrated briefly after this list), e.g.:
- List running docker containers
- Retrieve the token of a running dockerized Jupyter notebook instance
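A hedged sketch of what those commands look like; the container name is a placeholder:

```bash
# List running Docker containers
docker ps

# Retrieve the token of a running dockerized Jupyter notebook instance,
# either from the container logs ...
docker logs <container-name> 2>&1 | grep token

# ... or by asking Jupyter directly inside the container
docker exec <container-name> jupyter notebook list
```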
The folder ./examples contains multiple examples for running Docker containers with GPU support.
1. nvidia-smi: examples/nvidia-cuda/README.md
   Start with this example, which uses a pre-built image with `docker compose` and shows how a GPU can be made accessible within a container using `docker compose` (a minimal sketch of such a compose file follows the note below). Afterwards, try out the PyTorch or TensorFlow examples.
2. TensorFlow, PyTorch
   After having tried out the example mentioned in 1.), try out this one, which customizes an image based on the `nvcr.io/nvidia/pytorch` and `nvcr.io/nvidia/tensorflow` images.
   - PyTorch: examples/pytorch/README.md
   - TensorFlow: examples/tensorflow/README.md
ℹ️ I personally prefer using `docker-compose.yml` files as they offer a clean way to build images and start containers without the need for long and tedious commands on the CLI or for hard-to-maintain bash scripts.
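The following is a minimal sketch of such a compose file for the nvidia-smi example; the service name, image tag, and command are assumptions and do not necessarily match the files in ./examples:

```yaml
# docker-compose.yml (sketch): expose the host GPU to a CUDA container
services:
  cuda:
    image: nvidia/cuda:11.8.0-base-ubuntu20.04   # assumed tag; use one matching your driver
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all            # or a specific number of GPUs
              capabilities: [gpu]
```

Start it with `docker compose up`; the container should print the same GPU table as `nvidia-smi` on the host.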
In 2018 I got a good deal on a used Lenovo ThinkStation P520 (30BE006X**) equipped as follows:
- Xeon W-2133 6C/3.6GHz/8.25MB/140W/DDR4-2666
- 900W Platinum Power Supply
- 1x 32GB RDIMM DDR4-2666 ECC
- 256GB SSD + 1 TB HDD, both SATA
- 1x NVIDIA Quadro P4000
(Data sheet)
- 1792 CUDA cores
- 8 GB GDDR5 GPU Memory
- CUDA compute capability 6.1
- 5.3 TFLOPS FP32 performance
Modifications over time:
- Replacements
- GPU: NVIDIA TITAN RTX replacing the NVIDIA Quadro P4000
  - SSD: 1TB SSD (SATA) replacing the 1TB HDD
- RAM: 4x 32GB RAM (same type but different vendor) replacing the 1x 32GB RAM
- Added hardware
- SSD: 512GB SSD (NVMe)