How to do deep learning using custom Jupyter kernels on Sherlock

2020-02-10

A recipe for interactive computing using custom Jupyter kernels on Stanford’s Sherlock.

1. Download and install Miniconda

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# install
bash Miniconda3-latest-Linux-x86_64.sh 
conda config --set always_yes yes 

2. Install jupyter notebook/lab and secure your notebooks with a password

# install the default py3 kernel for jupyter notebook
conda install ipython jupyter notebook jupyterlab
# add password
jupyter notebook password

3. (Optional) Add a custom conda environment, e.g. fastai

conda create -n fastai ipython ipykernel 
# register the custom environment as a Jupyter kernel
conda activate fastai
python -m ipykernel install --user --name fastai --display-name FastAI

You can also add kernels for other languages such as R or Julia.
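
For reference, `python -m ipykernel install` registers the environment by writing a small kernel spec (a `kernel.json` file) into your user Jupyter directory. The sketch below builds a dictionary of the same shape in Python. The helper name `make_kernel_spec` and the interpreter path are illustrative assumptions, not part of ipykernel's API.

```python
import json


def make_kernel_spec(python_path, display_name):
    """Return a dict shaped like the kernel.json that ipykernel writes."""
    return {
        # Command Jupyter runs to start the kernel; {connection_file} is
        # filled in by Jupyter at launch time.
        "argv": [python_path, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


# Hypothetical interpreter path for the fastai environment.
spec = make_kernel_spec("/home/users/me/miniconda3/envs/fastai/bin/python", "FastAI")
print(json.dumps(spec, indent=2))
```

To see which kernels are actually registered and where their specs live, run `jupyter kernelspec list`.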

4. Install pytorch/tensorflow

Select a CUDA version that is already installed on Sherlock.

conda install -c pytorch pytorch torchvision cudatoolkit=10.1 

Or, for TensorFlow:

conda install tensorflow-gpu cudatoolkit=10.1

5. Load the GPU modules. Select the CUDA version that matches the cudatoolkit you've just installed

# this is my version
module load cuda/10.1.168
module load cudnn/7.6.4
module load nccl
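
One way to double-check that the loaded module matches the conda package is to compare the major.minor parts of the two version strings. This is a minimal sketch, assuming module names follow the `cuda/<major>.<minor>.<patch>` pattern shown above; `cuda_versions_match` is a hypothetical helper, not a Sherlock or conda tool.

```python
def major_minor(version):
    """Return the (major, minor) pair of a dotted version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def cuda_versions_match(module_name, cudatoolkit_version):
    """Compare e.g. 'cuda/10.1.168' against cudatoolkit '10.1'."""
    module_version = module_name.split("/", 1)[1]
    return major_minor(module_version) == major_minor(cudatoolkit_version)


print(cuda_versions_match("cuda/10.1.168", "10.1"))  # True
print(cuda_versions_match("cuda/9.0.176", "10.1"))   # False
```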

6. Now open ipython and run

import torch
print(torch.cuda.is_available())

If this prints True, you're ready to use GPUs.
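
If you want a little more detail than a single True/False, the check can be wrapped in a function that also lists the visible devices and does not crash when PyTorch is missing from the active environment. This is only a sketch; `gpu_report` is a hypothetical helper name.

```python
def gpu_report():
    """Summarize GPU visibility without crashing if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    available = torch.cuda.is_available()
    devices = []
    if available:
        # Collect the name of every GPU visible to this process.
        devices = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return {"torch_installed": True, "cuda_available": available, "devices": devices}


print(gpu_report())
```

On a login node without GPUs, `cuda_available` will typically be False even when the installation is fine, so run the check inside a GPU job.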

Follow these steps on your local machine

see details here.

7. Download the `forward` repo

git clone https://github.com/vsoch/forward
cd forward

8. Generate your parameters

bash setup.sh

When prompted to select the Sherlock partition, enter: gpu

9. SSH Credentials

bash hosts/sherlock_ssh.sh >> ~/.ssh/config

10. Create an sbatch script in forward/sbatches/sherlock and save it as `jupyter-gpu.sbatch`

#!/bin/bash

PORT=$1
NOTEBOOK_DIR=$2
# fall back to $SCRATCH when no notebook directory is given
if [ -z "$NOTEBOOK_DIR" ]; then
    cd "$SCRATCH"
else
    cd "$NOTEBOOK_DIR"
fi

## to compile libtorch C++ code, load these modules
# module load gcc/7.3.0
# module load gdb
# module load cmake
# export CC=$(which gcc)
# export CXX=$(which g++)

# select cuda version you need
module load cuda/10.1.168
module load cudnn/7.6.4
module load nccl

# activate fastai env
source activate fastai
jupyter lab --no-browser --port="$PORT"

11. Start a session

The default working directory is $SCRATCH.

bash start.sh jupyter-gpu

To change the working directory, pass it as a second argument:

bash start.sh jupyter-gpu /path/to/dir

12. Open a browser on your local machine and go to the forwarded port

For example, if your port is 51888:

http://localhost:51888/
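
If the page does not load, it can help to check whether anything is listening on the forwarded port on your local machine. Here is a small sketch using only the Python standard library; `port_is_open` is a hypothetical helper, and 51888 is just the example port from above.

```python
import socket


def port_is_open(port, host="localhost", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(port_is_open(51888))
```

A False result usually means the session or its tunnel is not running yet, so give the job a moment to start or try resuming the session.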

Here is my JupyterLab computing environment. Have fun!

fastai kernel

Test GPUs

13. Resume a session

bash resume.sh jupyter-gpu
# or
bash resume.sh jupyter-gpu /path/to/dir

14. Stop a session

bash end.sh jupyter-gpu
# or
bash end.sh jupyter-gpu /path/to/dir