Parallelisation
There are three main approaches to parallelisation.
- Multithreaded application: A single program that runs multiple threads sharing the same memory
- Job arrays: Several independent instances of the same program that run in parallel
- MPI: A single program that runs as multiple processes, each with its own private memory, communicating via message passing; please contact us if you would like assistance using MPI
Multithreaded application
This is the simplest case. You only need to request the number of vCPUs that your application can take advantage of. In most cases, there is a parameter that you would need to specify when calling the application. The application's documentation should explain what parameter to use. You can find sample job scripts in the Job Submission section.
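A minimal job script for a multithreaded run might look like the sketch below. Here your_application and its --threads option are placeholders; consult your application's documentation for the actual option name. The key point is that --cpus-per-task requests the vCPUs and SLURM_CPUS_PER_TASK passes that number on to the application.
#!/bin/bash
#SBATCH --job-name=threadedJob
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=3850

# Pass the number of allocated vCPUs to the application.
# The option name (--threads) is application-specific.
srun your_application --threads=$SLURM_CPUS_PER_TASK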
Job arrays
In general, job arrays are useful for applying the same processing routine to a collection of input data files, or for applying different sets of parameters to the same input data. Job arrays offer a very simple way to submit a large number of independent processing jobs. In the example below, the --array=1-16 option causes 16 array tasks (numbered 1, 2, ..., 16) to be spawned when this master job script is submitted. The array tasks are simply copies of the master script that are automatically submitted to the scheduler on your behalf, so exactly the same amount of resources is requested for each array task. However, in each array task an environment variable called SLURM_ARRAY_TASK_ID is set to a unique value; in this example, the values are in the range 1, 2, ..., 16. In your script, you can use this value to select, for example, a specific data file that each array task is responsible for processing (see the sketch after the script below).
#!/bin/bash
#SBATCH --job-name=arrayJob
#SBATCH --output=arrayJob_%A_%a.out
#SBATCH --error=arrayJob_%A_%a.err
#SBATCH --array=1-16
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=3850
srun your_application $SLURM_ARRAY_TASK_ID
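For instance, a minimal sketch of such a selection, assuming a hypothetical plain-text file filelist.txt that lists one input file path per line:
# Pick the N-th line of filelist.txt, where N is the array task ID (1-16)
inputFile=$(sed -n "${SLURM_ARRAY_TASK_ID}p" filelist.txt)
srun your_application "$inputFile"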
Job array indices can be specified in a number of ways. For example:
- A job array with index values between 0 and 31:
#SBATCH --array=0-31
- A job array with index values of 1, 2, 5, 19, 27:
#SBATCH --array=1,2,5,19,27
- A job array with index values between 1 and 7 with a step size of 2 (i.e. 1, 3, 5, 7):
#SBATCH --array=1-7:2
In the example above, you can see that the output and error file names contain the special placeholders %A and %a. For each array task, they are replaced with the job ID and the array task ID, respectively.
Warning
Please make sure that each array task writes to its own unique set of files. For example, you can add $SLURM_ARRAY_TASK_ID as a file name suffix or append it to the output directory name.
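For instance, a minimal sketch based on the master script above, redirecting standard output to a file whose name includes the task ID:
# Each array task writes to its own output file, e.g. output_7.txt
srun your_application $SLURM_ARRAY_TASK_ID > output_${SLURM_ARRAY_TASK_ID}.txt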
Below is a sample script that runs a parameter sweep with a hypothetical application. It iterates over two parameters: alpha and gamma. The first parameter takes values from 0 to 6 with a step of 2 (i.e., 0, 2, 4, 6). If a step of 1 is desired, the step value can be omitted, e.g., {0..10}. The second parameter can take the values 'Aa', 'Bb', or 'Cc'. In total, there are 4 * 3 = 12 parameter combinations, so the value of the --array parameter should be 0-11.
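If you are unsure what a brace expansion produces, you can echo it in a shell first:
echo {0..6..2}   # prints: 0 2 4 6
echo {0..10}     # prints: 0 1 2 3 4 5 6 7 8 9 10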
The nested loops generate two arrays that allow all parameter combinations to be reconstructed. In this case, alphaArr contains 0 0 0 2 2 2 4... while gammaArr contains Aa Bb Cc Aa Bb Cc Aa... Later, $SLURM_ARRAY_TASK_ID is used to retrieve a particular combination. The array indices start at 0. Thus, when $SLURM_ARRAY_TASK_ID is 2, for example, alpha is set to 0 and gamma takes the value Cc, which are subsequently passed to the myapp application.
The output file is saved to the results directory. To make the output file unique, both parameter values are added to its name. For example, when $SLURM_ARRAY_TASK_ID is 2, the output path will be results/output_a0_gCc.txt. Note the use of curly braces (i.e., {}) around the variable names. Since variable names can contain underscores, without the braces bash would interpret the first variable as $alpha_g. The curly braces around gamma are technically redundant, but they help prevent a potential bug if another variable is ever appended to the file name.
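You can see the effect of the braces with a quick test in a shell (using alpha=0 and gamma=Cc, as in task 2):
alpha=0; gamma=Cc
echo output_a$alpha_g$gamma.txt        # prints: output_aCc.txt ($alpha_g is unset)
echo output_a${alpha}_g${gamma}.txt    # prints: output_a0_gCc.txt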
The backslash (i.e., \) is a line continuation character. There must be no spaces after a backslash.
Slurm exports several environment variables when it submits a job for execution. You may find SLURM_CPUS_PER_TASK particularly useful. In this example, the variable is used to specify the number of threads available to the application.
#!/usr/bin/env bash
#SBATCH --time=1:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=3850
#SBATCH --job-name=param_sweep
#SBATCH --output=param_sweep_%A_%a.out
#SBATCH --array=0-11

# Parameter values to sweep over
alphas=({0..6..2})
gammas=(Aa Bb Cc)

# Expand the nested loops into one entry per parameter combination
alphaArr=()
gammaArr=()
for alpha in "${alphas[@]}"; do
    for gamma in "${gammas[@]}"; do
        alphaArr+=($alpha)
        gammaArr+=($gamma)
    done
done

# Select the combination assigned to this array task
alpha=${alphaArr[$SLURM_ARRAY_TASK_ID]}
gamma=${gammaArr[$SLURM_ARRAY_TASK_ID]}

# Make sure the output directory exists
mkdir -p results

srun myapp \
    --alpha=$alpha \
    --gamma=$gamma \
    --threads=$SLURM_CPUS_PER_TASK \
    --output=results/output_a${alpha}_g${gamma}.txt
Before submitting the job, it is helpful to test the script to ensure that it translates the task ID correctly into the parameters. This can be done by setting $SLURM_ARRAY_TASK_ID to the first command line argument before its first use and adding echo before srun.
# ...
# For testing only: take the task ID from the first command line argument
SLURM_ARRAY_TASK_ID=$1

alpha=${alphaArr[$SLURM_ARRAY_TASK_ID]}
gamma=${gammaArr[$SLURM_ARRAY_TASK_ID]}

# For testing only: print the command instead of running it
echo srun myapp \
    --alpha=$alpha \
    --gamma=$gamma \
    --threads=$SLURM_CPUS_PER_TASK \
    --output=results/output_a${alpha}_g${gamma}.txt
Suppose the job script has been saved as myjob and given execute permission with chmod u+x myjob. Now, when you call it with different IDs, it prints the application call string that would have been executed. Notice that you need to run the script directly, without sbatch.
./myjob 2
./myjob 5
./myjob 9
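For example, ./myjob 2 should print something like the following line. Note that the --threads value is empty here because SLURM_CPUS_PER_TASK is only set when the job actually runs under Slurm.
srun myapp --alpha=0 --gamma=Cc --threads= --output=results/output_a0_gCc.txt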
Caution
Make sure you remove both changes before submitting the job!