Quick start

The goal of this short tutorial is to introduce new users to the ScienceCluster environment. It assumes that the readers already have experience with remote Linux servers but not necessarily with clusters. If you have never worked with remote servers before, you may want to start with the detailed instructions instead.

Connecting to the cluster

The cluster can be reached via ssh at cluster.s3it.uzh.ch. The load balancer redirects requests in round-robin fashion to one of several login nodes. Your username is your UZH Active Directory (AD) shortname. In most cases, the password is the same as your Email/Collaboration password. If you are unable to log in using your Email/Collaboration password, you will need to update your AD password in the Identity Manager.

ssh shortusername@cluster.s3it.uzh.ch

Detailed instructions...

Data storage

There are three filesystems where you can store your data.

Your home filesystem (/home/cluster/<shortname>) has a quota of 15 GB / 100,000 files. Typically, it is used to store configuration and small important files.

For persistent storage of larger files, you can use the data filesystem (~/data or /data/<shortname>). It has a limit of 200 GB but it is not backed up. This filesystem is also appropriate for software installation (e.g., Python modules or R packages).

Tip

If you need additional space for persistent data beyond the data filesystem, you can request scalable storage. It is not subject to quota but it requires cost contributions based on the actual usage.

Large input data and computational results can be stored on the scratch filesystem (~/scratch or /scratch/<shortname>), which has a quota of 20 TB and is not backed up. Please note that this filesystem is meant for temporary storage and the files may be automatically deleted if they have not been accessed within one month.
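To see how close you are to these limits, a small helper along the following lines may be useful. This is only a sketch: the function names are hypothetical, the check relies on the access-time test of find, and "one month" is approximated as 30 days. The cluster's actual cleanup policy may use different criteria.

```shell
# Hypothetical helpers, not part of the cluster software.

# List files under a directory that have not been accessed for more than N days
# (N defaults to 30 as an approximation of "one month").
list_stale_files() {
    local dir="$1" days="${2:-30}"
    find "$dir" -type f -atime +"$days" -print
}

# Count regular files under a directory, e.g. to compare against the
# 100,000-file quota of the home filesystem.
count_files() {
    find "$1" -type f | wc -l
}
```

For example, count_files "$HOME" gives the number of files to compare against the home quota, and list_stale_files ~/scratch 30 lists scratch files that might be candidates for automatic deletion.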

More information...

Partitions

Our cluster has been partitioned according to its hardware capabilities. The partitions are as follows:

  • generic: jobs requiring at most 32 vCPUs and/or 123 GB of RAM.
  • hpc: jobs requiring a high-speed interconnect or high CPU/memory (> 32 vCPUs or > 123 GB RAM per job).
  • hydra: jobs requiring more than 377 GB of RAM.
  • vesta: jobs requiring GPUs (equipped with Nvidia K80 cards).
  • volta: jobs requiring GPUs (equipped with Nvidia V100 cards).
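The CPU and memory thresholds in the list above can be expressed as a small decision helper. This is a minimal sketch: the function name is hypothetical, it takes whole numbers of vCPUs and GB, and it only covers the CPU partitions. GPU jobs (vesta, volta) and jobs that need the high-speed interconnect have to be chosen by hand.

```shell
# Hypothetical helper: suggest a CPU partition from the thresholds listed above.
# Arguments: number of vCPUs, memory in GB (both integers).
suggest_partition() {
    local vcpus="$1" mem_gb="$2"
    if [ "$mem_gb" -gt 377 ]; then
        echo hydra          # more than 377 GB of RAM
    elif [ "$vcpus" -gt 32 ] || [ "$mem_gb" -gt 123 ]; then
        echo hpc            # more than 32 vCPUs or more than 123 GB of RAM
    else
        echo generic        # at most 32 vCPUs and at most 123 GB of RAM
    fi
}
```

For example, suggest_partition 8 32 prints generic, while suggest_partition 16 200 prints hpc.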

You can switch to a specific partition by loading one of the partition modules. For example, the following command selects the generic partition.

module load generic

No partition is selected by default. So, if you do not load a partition module and do not specify the partition explicitly as an sbatch parameter, your job will be rejected. You can see the list of available partitions by listing all available modules with module avail or module av. Partitions will be listed in the section titled /sapps/etc/modules/start. The following command can be used to display the partitions that you can access.

sacctmgr show assoc format=partition,account%20,qos%30 user=<username>

After loading a partition, you can also use the module av command to see the list of software available on that partition.

Click here for more detailed information about the partitions.

Job scheduling

Jobs are submitted with the sbatch command. The default values for resource allocations are very low. If you do not specify any parameters, Slurm (the automatic job allocation system) will allocate 1 vCPU, 1 MB of memory, and 1 second for execution time. Therefore, you need to specify at minimum the amount of memory and the expected runtime. For example, to run a hostname command on the cluster, you can create a file named test.job with the following contents:

#!/usr/bin/env bash
hostname

Then you can submit it for execution with the following command (assuming that you have already loaded a partition module).

sbatch --time=0:10:0 --mem=7800 --cpus-per-task=2 test.job

This will request 2 vCPUs and 7800 MB of RAM for 10 minutes. Alternatively, you can specify these parameters in your job file; e.g.,

#!/usr/bin/env bash
#SBATCH --time=0:10:0
#SBATCH --mem=7800
#SBATCH --cpus-per-task=2
hostname

The memory per vCPU ratio is the same on all nodes of a particular partition. It is 4 GB/vCPU on generic, 8 GB/vCPU on hpc, and 24 GB/vCPU on hydra. However, Slurm has to reserve some memory for system operations. Therefore, optimal hardware utilisation can only be achieved when a slightly smaller amount of memory is requested per vCPU, namely 3850 MB for generic, 8000 MB for hpc, and 24100 MB for hydra. If you request multiples of those numbers with --mem or the exact numbers with --mem-per-cpu, you might be able to schedule more jobs to run in parallel than with 4G, 8G, or 24G on generic, hpc, and hydra respectively.
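The arithmetic behind "multiples of those numbers" can be sketched as follows. The function names are hypothetical, and the per-vCPU figures are interpreted as megabytes, which is the default unit of --mem.

```shell
# Hypothetical helpers, not part of Slurm or the cluster software.

# Per-vCPU memory in MB recommended in the text for each partition.
mem_per_cpu_mb() {
    case "$1" in
        generic) echo 3850 ;;
        hpc)     echo 8000 ;;
        hydra)   echo 24100 ;;
        *)       echo "unknown partition: $1" >&2; return 1 ;;
    esac
}

# Total memory in MB to pass to --mem for a job with the given number of vCPUs.
recommended_mem_mb() {
    local partition="$1" vcpus="$2" per_cpu
    per_cpu=$(mem_per_cpu_mb "$partition") || return 1
    echo $(( per_cpu * vcpus ))
}

# Possible use (requires Slurm and a loaded partition module):
#   sbatch --time=0:10:0 --cpus-per-task=2 --mem="$(recommended_mem_mb generic 2)" test.job
```

With these figures, a 2-vCPU job on generic would request 7700 MB and a 4-vCPU job on hpc would request 32000 MB.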

For testing or debugging purposes, you can run your job in an interactive session. Any other use of interactive sessions is generally discouraged. You can start an interactive session with the following command.

srun --pty --time=1:0:0 --mem-per-cpu=8000 --cpus-per-task=2 bash -l

For more detailed information on job submission, click here.

Maximum running time

You should strive to split your calculations into jobs that can finish in fewer than 24 hours. Short jobs are easier to schedule; i.e., they are likely to start earlier than long jobs. If something goes wrong, you might be able to detect it earlier. In case of a failure, you will be able to restart calculations from the last checkpoint rather than from the beginning. Finally, long jobs fill up the queue for extended periods and prevent other users from running their smaller jobs.

A job's runtime is controlled by the --time parameter. If your job runs beyond the specified time limit, Slurm will terminate it. Depending on the value of the --time parameter, Slurm automatically places jobs into one of the quality of service (QOS) groups, which in turn affects job scheduling priority as well as some other limits and properties. ScienceCluster has four different QOS groups.

  • normal: 24 hours
  • medium: 48 hours
  • long (vesta for GPU partitions): 7 days
  • verylong: 28 days

To use the verylong QOS (i.e., running times over 7 days), please request access via the UZH Help Desk (mentioning S3IT in the subject line). A single user can run only one job with the verylong QOS at a time. If you schedule multiple verylong jobs, they will run serially regardless of resource availability.
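The mapping from requested runtime to QOS can be illustrated with a short function. This is a sketch under assumptions: the function name is hypothetical, each duration in the list is treated as an inclusive upper limit, the runtime is given in whole hours, and the CPU partition names are used (on GPU partitions the 7-day QOS is called vesta). Slurm performs the actual assignment itself.

```shell
# Hypothetical helper: report which QOS a runtime (in whole hours) would fall into,
# assuming each limit in the list is an inclusive upper bound.
qos_for_hours() {
    local hours="$1"
    if   [ "$hours" -le 24 ];  then echo normal      # up to 24 hours
    elif [ "$hours" -le 48 ];  then echo medium      # up to 48 hours
    elif [ "$hours" -le 168 ]; then echo long        # up to 7 days
    elif [ "$hours" -le 672 ]; then echo verylong    # up to 28 days
    else
        echo "no QOS allows ${hours} hours" >&2
        return 1
    fi
}
```

For example, qos_for_hours 36 prints medium, so a job submitted with --time=36:0:0 would be expected to land in the medium QOS.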

Job management

You can view the list of currently scheduled and running jobs with the squeue command. Without any parameters, it will display all the jobs that are currently scheduled or running on the cluster. If you loaded a partition module, then the output will be limited to the jobs scheduled or running on that particular partition. To see only your jobs, you need to specify the -u parameter.

squeue -u <username>

To delete a job from the queue, use scancel with the Job ID as an argument. The Job ID is always reported when you schedule a job. You can also find it in the output of squeue. Multiple jobs can be deleted at once. For example,

scancel 2850610 2850611
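If you submit jobs from a script, it can be convenient to capture the Job ID instead of copying it by hand. The following sketch extracts it from the "Submitted batch job <id>" line that sbatch prints on success. The function name is hypothetical.

```shell
# Hypothetical helper: read sbatch output on stdin and print only the Job ID.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Possible use (requires Slurm):
#   jobid=$(sbatch test.job | job_id_from_sbatch)
#   scancel "$jobid"
```

Alternatively, sbatch has a --parsable option that prints the Job ID without the surrounding text.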

You can also cancel all your jobs at once without specifying any Job IDs. The following two commands delete all your jobs or all your pending jobs, respectively.

scancel -u <username>
scancel -u <username> --state=PENDING
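Before cancelling jobs in bulk, you may want to review which ones would be affected. The following sketch filters squeue output for pending jobs. The function name is hypothetical, and it expects one job per line in the form "<jobid> <state>", which squeue can produce with its --noheader and --format options.

```shell
# Hypothetical helper: read "<jobid> <state>" lines on stdin and print the IDs
# of jobs whose state is PENDING.
pending_job_ids() {
    awk '$2 == "PENDING" { print $1 }'
}

# Possible use (requires Slurm):
#   squeue -u <username> --noheader --format="%i %T" | pending_job_ids
```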

For more information about job management, click here.

Parallelisation

There are four main approaches to parallelisation.

  • Single program that runs multiple processes each with private memory allocation
  • Several program instances that run in parallel (i.e., job arrays)
  • Single master program that launches several slave programs
  • Single program that runs multiple processes, possibly across several nodes, that communicate by message passing (MPI)

For the first approach, you do not need to do anything special. You just submit a job requesting the number of vCPUs that your program can efficiently use. The other three approaches are described in the Job Scheduling section of the documentation.
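As a brief illustration of the second approach, a job array might look like the sketch below. The resource values, the array range, and the input file naming scheme are assumptions chosen for the example. Slurm sets the SLURM_ARRAY_TASK_ID variable separately for each task of the array.

```shell
#!/usr/bin/env bash
#SBATCH --time=0:10:0
#SBATCH --mem=3850
#SBATCH --cpus-per-task=1
#SBATCH --array=1-10

# Hypothetical naming scheme: task N processes input_N.txt.
input_for_task() {
    echo "input_$1.txt"
}

# Slurm sets SLURM_ARRAY_TASK_ID for every array task. The default of 1 only
# allows the script to be tried outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
echo "Task ${task_id} would process $(input_for_task "$task_id")"
```

Submitting this file once with sbatch would schedule ten independent tasks, each of which sees a different task ID.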

External resources

In addition to the documentation provided on this site, you can also find the following external resources useful.


Last update: March 3, 2022