Deep Learning with GPUs
The training phase of a deep learning model can be very time consuming. To accelerate it you may want to use GPUs, which requires the deep learning packages, such as Keras or PyTorch, to be installed with GPU support. This page documents how to install some well-known deep learning packages in Python and R. If you encounter any problem during the installation, or if you need to install other deep learning packages (in Python, R or another programming language), please send an email to helpdesk@unil.ch with the subject "DCSR: Deep Learning package installation" and we will try to help you.
Keras
To install the packages in your home directory:
cd $HOME
Log into a GPU node:
Sinteractive -p interactive -m 4G -G 1
Check that the GPU is visible:
nvidia-smi
Load parallel modules and python:
module purge
module load gcc cuda cudnn python/3.8.8
Create a virtual environment. Here we will call it "venv_keras_gpu", but you may choose another name:
virtualenv -p python venv_keras_gpu
Activate the virtual environment:
source venv_keras_gpu/bin/activate
Install TensorFlow and Keras:
pip install tensorflow
pip install keras
Check that Keras was properly installed:
python -c 'import keras; print(keras.__version__)'
There might be a warning message and the output should be something like "2.5.0".
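Beyond checking the version, you may want to confirm that TensorFlow actually sees the allocated GPU. The following is a minimal sketch; the helper name "count_visible_gpus" is ours, and it simply returns None when TensorFlow is not installed (e.g. outside the virtual environment):

```python
def count_visible_gpus():
    """Return the number of GPUs TensorFlow can see, or None if it is not installed."""
    try:
        import tensorflow as tf  # only available inside the virtual environment
    except ImportError:
        return None
    # list_physical_devices("GPU") returns one entry per GPU visible to TensorFlow
    return len(tf.config.list_physical_devices("GPU"))

print(count_visible_gpus())
```

On a GPU node with one allocated GPU, this should report 1; a result of 0 usually means the cuda/cudnn modules were not loaded before starting Python.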
You may install extra packages that your deep learning code will use. For example:
pip install scikit-learn
pip install pandas
pip install matplotlib
Deactivate your virtual environment and logout from the GPU node:
deactivate
exit
Comment
If you want to make your installation more reproducible, you may proceed as follows:
1. Create a file called "requirements.txt" and write the package names inside. You may also specify the package versions. For example:
tensorflow==2.4.1
keras==2.4.0
scikit-learn==0.24.2
pandas==1.2.4
matplotlib==3.4.2
2. Proceed as above, but instead of installing the packages individually, type
pip install -r requirements.txt
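Since reproducibility depends on every package being pinned to an exact version, a quick sanity check of the requirements file can help. This is a small stdlib sketch of our own, not part of pip; the helper name "unpinned" is an assumption:

```python
def unpinned(requirements_text):
    """Return the requirement lines that do not pin an exact version with '=='."""
    lines = [line.strip() for line in requirements_text.splitlines()]
    return [l for l in lines if l and not l.startswith("#") and "==" not in l]

# Example: the last line is not pinned, so it would be reported.
reqs = """\
tensorflow==2.4.1
keras==2.4.0
matplotlib
"""
print(unpinned(reqs))  # ['matplotlib']
```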
Run your deep learning code
To test your deep learning code (maximum 1h), say "my_deep_learning_code.py", you may use the interactive mode:
cd /scratch/username/
Sinteractive -p interactive -m 4G -G 1
module load gcc cuda cudnn python/3.8.8
source $HOME/venv_keras_gpu/bin/activate
Run your code:
python my_deep_learning_code.py
or start a Python interpreter and paste your code into it:
python
copy/paste your code
Once you have finished testing your code, you must close your interactive session (by typing exit), and then run it on the cluster by using an sbatch script, say "my_sbatch_script.sh":
#!/bin/bash -l
#SBATCH --account your_account_id
#SBATCH --mail-type ALL
#SBATCH --mail-user firstname.surname@unil.ch
#SBATCH --chdir /scratch/username/
#SBATCH --job-name my_deep_learning_job
#SBATCH --output my_deep_learning_job.out
#SBATCH --partition gpu
#SBATCH --gres gpu:1
#SBATCH --gres-flags enforce-binding
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 1
#SBATCH --mem 10G
#SBATCH --time 01:00:00
module load gcc cuda cudnn python/3.8.8
source $HOME/venv_keras_gpu/bin/activate
python /PATH_TO_YOUR_CODE/my_deep_learning_code.py
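Inside the job, Slurm exports environment variables that your code can inspect to confirm the allocation. The sketch below uses standard Slurm/CUDA variable names; the helper "slurm_gpu_info" is our own, and the values shown in the comments depend on the actual job:

```python
import os

def slurm_gpu_info(env=os.environ):
    """Return (job_id, visible_gpu_ids) from the environment, with safe defaults."""
    job_id = env.get("SLURM_JOB_ID")               # e.g. "123456" inside a job, None outside
    devices = env.get("CUDA_VISIBLE_DEVICES", "")  # e.g. "0" with --gres gpu:1
    gpu_ids = [d for d in devices.split(",") if d]
    return job_id, gpu_ids

print(slurm_gpu_info())
```

Logging this at the start of your script makes it easy to verify, from the job output file, that the GPU was really allocated.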
To launch your job:
cd $HOME/PATH_TO_YOUR_SBATCH_SCRIPT/
sbatch my_sbatch_script.sh
Multi-GPU parallelism
If you want to use 2 (or more) GPUs on the same node, you need to tell Keras to use them. For that, use the Keras function "multi_gpu_model" in your python code "my_deep_learning_code.py":
from keras.utils import multi_gpu_model
parallel_model = multi_gpu_model(model, gpus=2)
This function implements single-machine multi-GPU data parallelism (so gpus >= 2). It works as follows: divide the input data into multiple sub-batches, apply a copy of the model to each sub-batch, with every model copy executed on a dedicated GPU, and finally concatenate the results (on the CPU) into one big batch. For example, if your batch_size is 64 and you use gpus=2, the input data is divided into 2 sub-batches of 32 samples, each sub-batch is processed on one GPU, and the full batch of 64 processed samples is returned. This yields quasi-linear speedup.
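The splitting and concatenation described above can be sketched in plain Python (no GPUs involved; "fake_model" is a placeholder standing in for one model copy):

```python
def split_batch(batch, n_gpus):
    """Divide a batch into n_gpus contiguous sub-batches of (near-)equal size."""
    size = (len(batch) + n_gpus - 1) // n_gpus
    return [batch[i:i + size] for i in range(0, len(batch), size)]

def data_parallel_step(batch, n_gpus, model):
    """Apply one model copy per sub-batch, then concatenate the results."""
    sub_batches = split_batch(batch, n_gpus)
    outputs = [model(sb) for sb in sub_batches]  # in Keras, each copy runs on its own GPU
    return [y for out in outputs for y in out]   # concatenation happens on the CPU

fake_model = lambda sub_batch: [x * 2 for x in sub_batch]  # placeholder "model"
batch = list(range(64))
print(len(data_parallel_step(batch, 2, fake_model)))  # 64: the full batch comes back
```

With gpus=2 and a batch of 64, the two sub-batches of 32 are processed independently, which is why the speedup is close to linear as long as the per-GPU work dominates the splitting/concatenation overhead.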
And the sbatch script must contain the line:
#SBATCH --gres gpu:2
TensorFlow
The installation of TensorFlow is the same as for Keras, except that you do not need to install Keras, and that you may want to call your virtual environment "venv_tensorflow_gpu". Please refer to the Keras installation instructions above.
Warning
In TensorFlow 1.15 and previous versions, the packages for CPU and GPU are offered separately:
pip install tensorflow==1.15 # CPU
pip install tensorflow-gpu==1.15 # GPU
PyTorch
To install the packages in your home directory:
cd $HOME
Log into a GPU node:
Sinteractive -p interactive -m 4G -G 1
Check that the GPU is visible:
nvidia-smi
Load parallel modules and python:
module purge
module load gcc cuda cudnn python/3.8.8
Create a virtual environment. Here we will call it "venv_pytorch_gpu", but you may choose another name:
virtualenv -p python venv_pytorch_gpu
Activate the virtual environment:
source venv_pytorch_gpu/bin/activate
Install PyTorch:
pip install torch
pip install torchvision
Check that PyTorch was properly installed:
python -c 'import torch; print(torch.__version__)'
There might be a warning message and the output should be something like "1.8.1".
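As in the Keras section above, you can also check from within the virtual environment whether PyTorch can use the GPU. A minimal sketch; the helper name "cuda_available" is ours, and it returns None when PyTorch is not installed:

```python
def cuda_available():
    """Return torch.cuda.is_available(), or None if PyTorch is not installed."""
    try:
        import torch  # only available inside the virtual environment
    except ImportError:
        return None
    return torch.cuda.is_available()

print(cuda_available())
```

On a GPU node this should report True; False usually means the cuda/cudnn modules were not loaded before starting Python.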
You may install extra packages that your deep learning code will use. For example:
pip install scikit-learn
pip install pandas
pip install matplotlib
Deactivate your virtual environment and logout from the GPU node:
deactivate
exit
Comment
If you want to make your installation more reproducible, you may proceed as follows:
1. Create a file called "requirements.txt" and write the package names inside. You may also specify the package versions. For example:
torch==1.8.1
torchvision==0.9.1
scikit-learn==0.24.2
pandas==1.2.4
matplotlib==3.4.2
2. Proceed as above, but instead of installing the packages individually, type
pip install -r requirements.txt
Run your deep learning code
To test your deep learning code (maximum 1h), say "my_deep_learning_code.py", you may use the interactive mode:
cd /scratch/username/
Sinteractive -p interactive -m 4G -G 1
module load gcc cuda cudnn python/3.8.8
source $HOME/venv_pytorch_gpu/bin/activate
Run your code:
python my_deep_learning_code.py
or start a Python interpreter and paste your code into it:
python
copy/paste your code
Once you have finished testing your code, you must close your interactive session (by typing exit), and then run it on the cluster by using an sbatch script, say "my_sbatch_script.sh":
#!/bin/bash -l
#SBATCH --account your_account_id
#SBATCH --mail-type ALL
#SBATCH --mail-user firstname.surname@unil.ch
#SBATCH --chdir /scratch/username/
#SBATCH --job-name my_deep_learning_job
#SBATCH --output my_deep_learning_job.out
#SBATCH --partition gpu
#SBATCH --gres gpu:1
#SBATCH --gres-flags enforce-binding
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 1
#SBATCH --mem 10G
#SBATCH --time 01:00:00
module load gcc cuda cudnn python/3.8.8
source $HOME/venv_pytorch_gpu/bin/activate
python /PATH_TO_YOUR_CODE/my_deep_learning_code.py
To launch your job:
cd $HOME/PATH_TO_YOUR_SBATCH_SCRIPT/
sbatch my_sbatch_script.sh
R Keras
R Keras is an interface to Python Keras. In simple terms, this means that the Keras R package allows you to enjoy the benefit of R programming while having access to the capabilities of the Python Keras package.
To install the packages in your home directory:
cd $HOME
Log into a GPU node:
Sinteractive -p interactive -m 4G -G 1
Check that the GPU is visible:
nvidia-smi
Load parallel modules and python:
module purge
module load gcc cuda cudnn python/3.8.8 r/4.0.4
Launch an R environment:
R
Install the R packages by using a virtual environment. Here we will call it "r-keras_gpu", but you may choose another name:
install.packages("keras")
Would you like to use a personal library instead? (yes/No/cancel) yes
Would you like to create a personal library to install packages into? (yes/No/cancel) yes
And select Switzerland for the CRAN mirror.
library(keras)
install_keras(method="virtualenv", envname="r-keras_gpu", tensorflow="gpu")
q()
This will install Keras and TensorFlow.