How to run Guppy on ScienceCluster¶
S3IT is unable to offer a system-wide Guppy installation on ScienceCluster because Oxford Nanopore Technologies (ONT) provides it under severely restrictive terms and conditions. However, you might be able to run Guppy on the cluster as a customer of ONT if you accept their terms and conditions. We strongly recommend that you read the Terms & Conditions to ensure that your use of the software would not violate them.
How-to guide¶
To run Guppy on the cluster, you would need to build a Singularity container.
Install Singularity¶
If you are running Linux, you can install Singularity and build the image locally on your computer. While it is possible to run Singularity under macOS or Windows, the process might be rather cumbersome and it would probably be easier to create a ScienceCloud instance instead. The following process assumes that you are using Ubuntu and may need to be adjusted for other Linux flavours.
If you are familiar with Conda, you can use it to install a recent version of Singularity. Otherwise, you can install Singularity by following the instructions in the Singularity Admin Guide.
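Whichever route you take, it may help to confirm that the singularity command is available before continuing. The following is a minimal sketch; check_singularity is a hypothetical helper name chosen for illustration and is not part of Singularity.

```shell
# Hypothetical helper: print the Singularity version if the command is on PATH,
# otherwise report the problem on stderr and return a non-zero status.
check_singularity() {
    if command -v singularity >/dev/null 2>&1; then
        singularity --version
    else
        echo "singularity not found on PATH" >&2
        return 1
    fi
}
```

Calling check_singularity in your terminal should print a version string once the installation has succeeded.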
Prepare definition file¶
Singularity images are typically built from a definition file that describes various container properties and indicates what software to install.
Create a directory for building the image¶
cd
mkdir build && cd build
Copy the following template of the definition file into guppy.def¶
Bootstrap: library
# As of 2020-05-20 guppy requires ubuntu16
From: ubuntu:16.04
%help
This is a container that runs guppy.
%labels
guppy
%environment
LANG=C.UTF-8
LC_ALL=C.UTF-8
export LANG LC_ALL
%test
# Exits with 255 (!?!)
# guppy_aligner --version
guppy_aligner -h
# Exits with 255 (!?!)
# guppy_barcoder --version
guppy_barcoder -h
guppy_basecaller --version
guppy_basecaller -h
%post
export DEBIAN_FRONTEND=noninteractive
# update system and install prerequisites
apt-get -qq update && apt-get -qq install -y --no-install-recommends gnupg \
lsb-release \
curl \
wget \
apt-transport-https \
zlib1g-dev \
tar \
bzip2 \
gzip \
xz-utils \
unzip \
ca-certificates \
libcuda1-384
# Place Guppy installation commands below
# cleanup
rm -rf /tmp/downloaded_packages/ && \
rm -rf /tmp/*.rds && \
rm -rf /var/lib/apt/lists/*
Get Guppy protocol¶
Find the Guppy protocol on the ONT website (login required), go to the Linux section of the protocol, and find the subsection that describes the installation from a .deb file. Paste those commands into the definition file below the line that says # Place Guppy installation commands below. Currently, there are two blocks in the instructions: the first block adds the repository, and the second block installs the package. Please make sure that the second block is for the GPU version of Guppy.
Adjust the installation commands.¶
Warning
If you do not make these changes, the image may fail to build or you will not be able to run it on the cluster.
- Remove the first two commands from the first command block, i.e. apt-get update and apt-get install. They are already included in the template.
- Remove sudo from all commands. Some lines contain more than one instance of sudo; all of them should be removed.
- Find the Guppy install command and add the --no-install-recommends -y flags after the word install. This will prevent the installation of the NVIDIA driver, which is already installed on the cluster.
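The last two adjustments can also be sketched with sed if you prefer not to edit by hand. This is only an illustration: adjust_install_commands is a hypothetical helper, and the package name you pass to it is whatever appears in the ONT protocol (it is not reproduced here). The first adjustment, removing the leading apt-get update and apt-get install commands, is still done manually.

```shell
# Hypothetical helper: read installation commands on stdin and print adjusted ones.
# $1 is the Guppy package name exactly as it appears in the protocol.
adjust_install_commands() {
    pkg=$1
    # Step 1: drop every "sudo", including several on one line.
    # Step 2: on the line that mentions the package, add the flags after "install".
    sed -E 's/(^|[[:space:]])sudo[[:space:]]+/\1/g' |
        sed -E "/${pkg}/ s/(install)[[:space:]]+/\1 --no-install-recommends -y /"
}
```

For example, if you saved the pasted commands in a file, you could run adjust_install_commands with the package name as its argument and the file on standard input, then review the output before copying it into guppy.def.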
Build the image¶
singularity build guppy-3.6.1.sif guppy.def
Transfer the image to the cluster¶
The command below will save it to your data directory. Please feel free to change the target directory to something else. Note, however, that the image is large and it would be better not to store it in your home directory.
scp -p guppy-3.6.1.sif <shortname>@cluster.s3it.uzh.ch:data
Expand the image¶
Expand the image into a sandbox directory once, so that Singularity does not have to expand it automatically every time it runs. From this point on, all operations are performed on the cluster. It is a good idea to create a separate directory to store your containers.
# Log out of the ScienceCloud instance you used to build the image and log into ScienceCluster
cd ~/data
mkdir containers
module load singularity
singularity build --sandbox containers/guppy-3.6.1 guppy-3.6.1.sif
Test the expanded image¶
srun --time=0:5:0 --gres=gpu:1 --pty /bin/bash
singularity exec -u --nv $HOME/data/containers/guppy-3.6.1 guppy_basecaller --version
exit
Remove the .sif file¶
rm guppy-3.6.1.sif
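Because the .sif file is large and takes a while to rebuild and transfer, you may want to delete it only after confirming that the expanded directory exists. A minimal sketch, assuming the sandbox was built into containers/guppy-3.6.1 as in the expansion step; remove_sif_if_expanded is a hypothetical helper name.

```shell
# Hypothetical helper: delete the .sif file ($1) only if the sandbox directory ($2) exists.
remove_sif_if_expanded() {
    sif=$1
    sandbox=$2
    if [ -d "$sandbox" ]; then
        rm -f -- "$sif"
    else
        echo "sandbox $sandbox not found; keeping $sif" >&2
        return 1
    fi
}
```

From ~/data, it could be called as remove_sif_if_expanded guppy-3.6.1.sif containers/guppy-3.6.1.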
Loading the Singularity module¶
When submitting a job, you always need to load the Singularity module. Then you can run Guppy commands as in the test command above. You can specify multiple Guppy commands in the same batch file, but you have to use 'singularity exec' with each one. For example, your script may look as follows.
#!/usr/bin/env bash
#SBATCH --time=10:00:00 --gres=gpu:1
module load singularity
singularity exec -u --nv $HOME/data/containers/guppy-3.6.1 guppy_barcoder ...
singularity exec -u --nv $HOME/data/containers/guppy-3.6.1 guppy_basecaller ...
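If your script runs many Guppy commands, repeating the full 'singularity exec' prefix can be avoided with a small shell function. This is a sketch, not a required pattern: run_guppy and GUPPY_IMAGE are hypothetical names, and the default path assumes the sandbox was built into ~/data/containers as in the expansion step.

```shell
# Assumed location of the expanded image; override GUPPY_IMAGE if yours differs.
GUPPY_IMAGE="${GUPPY_IMAGE:-$HOME/data/containers/guppy-3.6.1}"

# Hypothetical helper: run any Guppy command inside the container,
# passing all arguments through unchanged.
run_guppy() {
    singularity exec -u --nv "$GUPPY_IMAGE" "$@"
}
```

After 'module load singularity' in the batch script, a line such as run_guppy guppy_basecaller --version would then be equivalent to the full command used in the test step.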