Using Conda on the ScienceCluster

Conda is one of the supported environment management tools on the ScienceCluster and is available as a module. To load it in your session, run:

module load anaconda3

Alternatively, you can use Mamba, a drop-in replacement for Conda that resolves dependencies considerably faster, especially for complex dependency trees:

module load mamba

Info

Please refer to the generic guide Using Conda for instructions about using Conda environments.

Verify your `.condarc` file points to your data directory

Conda environments can generate many files of varying sizes, so it’s best to store them in your /data/$USER directory where you have ample space and faster SSD storage for many files. This setup is the default for new ScienceCluster users, but it’s good practice to verify your configuration.

To check your current .condarc settings:

cat ~/.condarc
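
Reading the file by eye works, but a scripted check can be handy when you set up several accounts or machines. The following is a minimal sketch, not part of the ScienceCluster tooling: `check_condarc` is a hypothetical helper name, and it assumes the simple layout of the recommended file, where each key line is followed by `- path` entries that contain the expanded path rather than a literal `$USER`.

```shell
# Hypothetical helper: report whether a .condarc keeps envs and pkgs under the data directory.
# Usage: check_condarc                      # checks ~/.condarc against /data/$USER
#        check_condarc FILE DATA_DIR        # checks another file or directory
check_condarc() {
    rc="${1:-$HOME/.condarc}"
    data_dir="${2:-/data/$USER}"
    status=0
    if [ ! -f "$rc" ]; then
        echo "missing: $rc"
        return 1
    fi
    for key in envs_dirs pkgs_dirs; do
        # Print the "- path" entries that directly follow "key:",
        # then look for the data directory among them.
        if awk -v key="$key:" '
                $1 == key { in_block = 1; next }
                in_block && /^[ \t]*-/ { print; next }
                { in_block = 0 }
            ' "$rc" | grep -qF -- "$data_dir/"; then
            echo "ok: $key is under $data_dir"
        else
            echo "check: $key does not point to $data_dir"
            status=1
        fi
    done
    return "$status"
}
```

The function returns a non-zero status when the file is missing or either key points elsewhere, so it can be used in an `if` statement before creating environments.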

To create a new .condarc file with the recommended settings, or to reset an existing one (this overwrites the current ~/.condarc), run:

cat << EOF > ~/.condarc
channels:
  - conda-forge
  - defaults
envs_dirs:
  - /data/$USER/conda/envs
pkgs_dirs:
  - /data/$USER/conda/pkgs
auto_activate_base: false
EOF

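This configuration stores environments and package caches under /data/$USER, which keeps you from exceeding the quota on your /home/$USER directory. Because the heredoc delimiter EOF is unquoted, the shell expands $USER when the file is written, so the resulting .condarc contains your actual username.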

Workflow Overview

On ScienceCluster, we recommend the following workflow for integrating a Conda environment into your submission scripts:

Set Up Your Environment: Start an interactive session, then create your Conda environment and install the required packages within that session.
Integrate Environment into SLURM Script: After installing the packages, exit the interactive session. Back on a login node, add the module load and environment activation commands to your SLURM script, then submit the script from there.

Create your environment on ScienceCluster

To create a Conda environment, it’s best to work within an interactive session on a compute node, which gives you access to more resources than the login nodes provide:

srun --pty -n 1 -c 4 --time=01:00:00 --mem=16G bash -l
module load mamba
mamba create --name myenv python=3.13

Tip

If your environment will install GPU-accelerated packages, request a GPU compute node by adding --gpus=1 to your srun command. See our guide for more information and examples.

To activate the environment:

source activate myenv

Tip

On the ScienceCluster, you can ignore Conda's instruction to use conda activate. Use source activate instead.

After activation your shell prompt will show the environment name:

(myenv) <username>@<hostname>

With your environment active you’re ready to install packages, run your applications, or begin your analysis.
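
The prompt prefix is only visible in interactive shells. In scripts, one way to confirm the activation is to inspect `CONDA_DEFAULT_ENV`, a variable that Conda's activation sets to the name of the active environment (or to its full path when the environment lives outside the configured envs_dirs). The sketch below is an illustration under that assumption; `require_env` is a hypothetical helper name, not a Conda or ScienceCluster command.

```shell
# Hypothetical helper: succeed only if the expected Conda environment is active.
# Usage: require_env myenv
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]; then
        echo "active: $1"
    else
        # Report the mismatch on stderr and signal failure to the caller.
        echo "expected '$1' but found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}
```

Called right after `source activate myenv`, for example as `require_env myenv || exit 1`, it stops a script early instead of letting it run against the wrong Python installation.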

Submitting an sbatch job with a Conda environment

Below is an example sbatch script for submitting a non-interactive job using an existing Conda environment:

#!/usr/bin/bash -l

#SBATCH --time=0-00:10:00    # runtime limit (D-HH:MM:SS), here 10 minutes
#SBATCH --mem=4G             # memory per node
#SBATCH --ntasks=1           # number of tasks (usually 1 for most jobs)
#SBATCH --cpus-per-task=1    # number of CPUs per task

module load mamba
source activate myenv

python3 /data/$USER/mycode.py
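
Once the script is saved, it can be submitted from a login node with `sbatch`. The following sketch wraps submission and a first status check in one function. It assumes the script above was saved as `myjob.sh`, which is a placeholder file name, and `submit_job` is a hypothetical helper name; `sbatch --parsable` and `squeue --jobs` are standard Slurm options.

```shell
# Hypothetical helper: submit a batch script and show its queue entry.
# Usage: submit_job myjob.sh        # myjob.sh is a placeholder file name
submit_job() {
    script="$1"
    # --parsable makes sbatch print "jobid" or "jobid;cluster" instead of a sentence.
    out="$(sbatch --parsable "$script")" || return 1
    # Keep only the job ID, dropping an optional ";cluster" suffix.
    job_id="${out%%;*}"
    echo "submitted job $job_id"
    squeue --jobs="$job_id"
}
```

If `sbatch` rejects the script, the function returns a non-zero status without querying the queue.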