
How to run Guppy on the ScienceCluster

S3IT is unable to offer a system-wide Guppy installation on the ScienceCluster because ONT (Oxford Nanopore Technologies) provides it under severely restrictive terms and conditions. However, as an ONT customer you may be able to run Guppy on the cluster yourself if you accept those terms and conditions. We strongly recommend that you read the Terms & Conditions to make sure that your use of the software does not violate them.

How-to guide

To run Guppy on the cluster, you need to build a Singularity container.

Install Singularity

If you are running Linux, you can install Singularity and build the image locally on your computer. While it is possible to run Singularity under macOS or Windows, the process can be cumbersome, and it is probably easier to create a ScienceCloud instance instead. The following steps assume that you are using Ubuntu and may need to be adjusted for other Linux distributions.

If you are familiar with Conda, you can use it to install a recent version of Singularity. Otherwise, you can install Singularity by following the instructions in the Singularity Admin Guide.

Prepare definition file

Singularity images are typically built from a definition file that describes various container properties and indicates what software to install.

Create a directory for building the image

cd
mkdir build && cd build

Copy the following template of the definition file into guppy.def

Bootstrap: library
# As of 2020-05-20 guppy requires ubuntu16
From: ubuntu:16.04

%help
This is a container that runs guppy.

%labels
Name guppy

%environment
LANG=C.UTF-8
LC_ALL=C.UTF-8
export LANG LC_ALL

%test
# Exits with 255 (!?!)
# guppy_aligner --version
guppy_aligner -h
# Exits with 255 (!?!)
# guppy_barcoder --version
guppy_barcoder -h
guppy_basecaller --version
guppy_basecaller -h

%post
export DEBIAN_FRONTEND=noninteractive

# update system and install prerequisites
apt-get -qq update && apt-get -qq install -y --no-install-recommends gnupg \
    lsb-release \
    curl \
    wget \
    apt-transport-https \
    zlib1g-dev \
    tar \
    bzip2 \
    gzip \
    xz-utils \
    unzip \
    ca-certificates \
    libcuda1-384

# Place Guppy installation commands below

# cleanup
rm -rf /tmp/downloaded_packages/ && \
rm -rf /tmp/*.rds && \
rm -rf /var/lib/apt/lists/*

Get the Guppy protocol

Find the Guppy protocol on the ONT website (login required), go to the Linux section of the protocol, and find the subsection that describes installation from a .deb file. Paste those commands into the definition file below the line that says # Place Guppy installation commands below. Currently, the instructions contain two blocks: the first adds the repository and the second installs the package. Please make sure that the second block is for the GPU version of Guppy.
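
If you would rather not paste by hand, this step can be scripted. The sketch below assumes that you saved the commands copied from the protocol in a separate file (the name guppy-commands.txt used in the usage note is hypothetical) and that guppy.def still contains the marker line from the template. The function name insert_guppy_commands is made up for illustration; it is not part of Singularity or Guppy. It prints the combined definition file to standard output and leaves guppy.def untouched.

```shell
#!/usr/bin/env bash
# Sketch: print a definition file with the saved installation commands
# inserted right after the marker line from the template.
# Assumption: the marker text in the file matches the template exactly.

insert_guppy_commands() {
    local def_file="$1"       # e.g. guppy.def
    local commands_file="$2"  # hypothetical file holding the copied commands
    # sed's "r" command appends the contents of a file after each matching line.
    sed "/^# Place Guppy installation commands below/r ${commands_file}" "${def_file}"
}
```

For example, insert_guppy_commands guppy.def guppy-commands.txt > guppy-new.def writes the result to a new file that you can review before building.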

Adjust the installation commands

Warning

If you do not make these changes, the image may fail to build or you will not be able to run it on the cluster.

  1. Remove the first two commands from the first command block, i.e. apt-get update and apt-get install. They are already included in the template.
  2. Remove sudo from all commands. There are multiple instances of sudo on some lines. All of them should be removed.
  3. Find the Guppy install command and add the flags --no-install-recommends -y after the word install. This prevents the installation of the NVIDIA driver, which is already installed on the cluster.
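
Changes 2 and 3 can be applied with a small sed filter. This is a sketch under assumptions: it expects the copied commands on standard input, it treats any line that mentions ont-guppy as the Guppy install line (the package name is an assumption, so check it against the protocol), and the function name adjust_guppy_commands is made up for illustration. Change 1 is left as a manual step because it depends on the exact layout of the first block.

```shell
#!/usr/bin/env bash
# Sketch: filter installation commands from stdin to stdout.
#  - removes every standalone "sudo" word, including several on one line
#  - adds "--no-install-recommends -y" after "install" on lines that
#    mention ont-guppy (assumed package name, verify in the protocol)

adjust_guppy_commands() {
    sed -E \
        -e 's/(^|[[:space:]])sudo[[:space:]]+/\1/g' \
        -e '/ont-guppy/ s/(apt(-get)?[[:space:]]+install)[[:space:]]+/\1 --no-install-recommends -y /'
}
```

A possible call is adjust_guppy_commands < guppy-commands.txt > guppy-commands-adjusted.txt, where both file names are placeholders. Review the output before pasting it into the definition file, since the filter cannot know about commands that differ from these assumptions.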

Build the image

singularity build guppy-3.6.1.simg guppy.def
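
Because a failed or unusable build costs a lot of time, it may be worth checking the definition file for the problems named in the warning above before you run the build. The helper below is only a sketch: check_guppy_def is a hypothetical name, and the ont-guppy package name it looks for is an assumption. It reports leftover sudo calls and Guppy install lines that lack --no-install-recommends, and it returns a non-zero status if it finds either.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a definition file before building.
# Assumption: the Guppy package name contains "ont-guppy".

check_guppy_def() {
    local def_file="$1"
    local status=0

    # sudo should have been removed from all commands in %post.
    if grep -Eq '(^|[[:space:]])sudo([[:space:]]|$)' "${def_file}"; then
        echo "${def_file}: found sudo, remove it from all commands" >&2
        status=1
    fi

    # Every install line for the Guppy package needs --no-install-recommends.
    if grep -E 'install.*ont-guppy' "${def_file}" | grep -qv -e '--no-install-recommends'; then
        echo "${def_file}: Guppy install line lacks --no-install-recommends" >&2
        status=1
    fi

    return "${status}"
}
```

It can be chained with the build, for example check_guppy_def guppy.def && singularity build guppy-3.6.1.simg guppy.def, so that the build only starts when both checks pass.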

Transfer the image to the cluster

The command below saves the image to your data directory. Feel free to change the target directory. Note, however, that the image is large, so it is better not to store it in your home directory.

scp -p guppy-3.6.1.simg myusername@cluster.s3it.uzh.ch:data

Expand the image

Expand the image into a sandbox directory once so that Singularity does not have to expand it automatically each time it runs. From this point on, all operations are performed on the cluster. It is a good idea to create a separate directory to store your containers.

# Log out of the ScienceCloud instance you used to build the image and log into ScienceCluster
cd ~/data
mkdir containers
module load vesta singularity
singularity build --sandbox containers/guppy-3.6.1 guppy-3.6.1.simg

Test the expanded image

srun --time=0:5:0 --gres gpu:1 --pty /bin/bash
singularity exec -u --nv -H /net/cephfs/home/$USER \
   -B /scratch:/scratch -B /data:/data -B /home/cluster/$USER:/home/cluster/$USER \
   $HOME/data/containers/guppy-3.6.1 guppy_basecaller --version
exit

Remove the .simg file

rm guppy-3.6.1.simg

Loading the Singularity module

When submitting a job, you need to load the singularity module in addition to the vesta or volta module. You can then run Guppy commands as in the test command above. You can put multiple Guppy commands in the same batch script, but each one has to be run with singularity exec. For example, your script may look like this (assuming that you load the vesta or volta module before you submit it).

#!/usr/bin/env bash
#SBATCH --time=10:00:00 --gres gpu:1
module load singularity

singularity exec -u --nv -H /net/cephfs/home/$USER \
   -B /scratch:/scratch -B /data:/data -B /home/cluster/$USER:/home/cluster/$USER \
   $HOME/data/containers/guppy-3.6.1 guppy_barcoder ...

singularity exec -u --nv -H /net/cephfs/home/$USER \
   -B /scratch:/scratch -B /data:/data -B /home/cluster/$USER:/home/cluster/$USER \
   $HOME/data/containers/guppy-3.6.1 guppy_basecaller ...
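
Since every Guppy call repeats the same singularity exec options, a small shell function can keep batch scripts shorter. This is a sketch rather than an S3IT-provided helper: the function name guppy_exec and the variables GUPPY_IMAGE and GUPPY_RUNTIME are hypothetical, and the default image path assumes that the sandbox was created as ~/data/containers/guppy-3.6.1 in the expansion step.

```shell
#!/usr/bin/env bash
# Hypothetical settings; adjust the path to where you expanded the image.
GUPPY_IMAGE="${GUPPY_IMAGE:-$HOME/data/containers/guppy-3.6.1}"
# Program to call; overridable so the wrapper can be dry-run with "echo".
GUPPY_RUNTIME="${GUPPY_RUNTIME:-singularity}"

# Run any command inside the Guppy container with the same options
# and bind mounts as the test command.
guppy_exec() {
    "${GUPPY_RUNTIME}" exec -u --nv -H "/net/cephfs/home/${USER}" \
        -B /scratch:/scratch -B /data:/data \
        -B "/home/cluster/${USER}:/home/cluster/${USER}" \
        "${GUPPY_IMAGE}" "$@"
}
```

With the function defined in the batch script after module load singularity, a call such as guppy_exec guppy_basecaller --version replaces the full singularity exec line. Setting GUPPY_RUNTIME=echo prints the command line instead of running it, which is a cheap way to check the paths before submitting a job.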

Last update: July 30, 2021