Using Conda on the ScienceCluster¶

While Conda environments on ScienceCluster can be managed through the miniforge3 module, we strongly recommend that users containerize their workflows.

Doing so has worthwhile benefits:

- It complies with FAIR principles.
- It consolidates an entire software environment into a single file.
- Perhaps most importantly, it makes your ScienceCluster workflows more efficient: a Conda environment typically consists of many small files, which a distributed filesystem handles far less efficiently than a single container image.
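
The filesystem point can be checked directly. The sketch below counts the files in an unpacked Conda environment so that the number can be compared with the single .sif file of a container. The helper name count_files and the example path are assumptions for illustration, not part of ScienceCluster's tooling.

```shell
# Hypothetical helper: count the regular files below a directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Example (the path is an assumption; adjust it to your setup):
#   count_files "$HOME/miniforge3/envs/venv"   # unpacked Conda environment
#   ls -lh conda.sif                           # the same software as one file
```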

To begin creating your environment, load the apptainer module:

    module load apptainer

Info

Please refer to the generic guide Using Conda for general instructions on working with Conda environments.

Workflow Overview¶

On ScienceCluster, we recommend the following workflow for integrating a Conda environment into your submission scripts:

Set Up Your Environment: To install the required packages, first start an interactive session. Then create your Conda env.yml file and use it to build a container with the specified software.
Integrate Environment into SLURM Script: After installing the packages, exit the interactive session. On the login nodes, integrate the Conda environment container into your SLURM script and submit the batch script from the login node.

Create your environment on ScienceCluster¶

For a basic example of how to create a virtual environment, refer to the relevant section of the ScienceCluster training.

For guidance on specific use cases, refer to the sections below.

Creating your environment from an `env.yml`¶

First, run an interactive session:

    srun --pty -n 1 -c 2 --time=00:30:00 --mem=7G bash -l

Then create a Conda env.yml file listing your required packages. This example installs Python and pandas:
```
name: venv
channels:
  - conda-forge
dependencies:
  - python=3.14
  - pandas
```

Next, create an Apptainer (Singularity) definition file, e.g., conda.def, that builds a container from the env.yml:

    Bootstrap: docker
    From: condaforge/miniforge3:latest

    %files
        ./env.yml /env.yml

    %post
        mamba env create --yes -f /env.yml
        mamba clean --all -f -y

    %environment
        export PATH="/opt/conda/envs/venv/bin:$PATH"

    %runscript
        exec python "$@"

Note the following details about the .def file:
- The From: value is condaforge/miniforge3:latest, which gives access to the latest version of Conda.
- The %files section imports the env.yml from your present working directory into the container (located at /env.yml within the container filesystem).
- The %post section creates the Conda/Mamba environment inside of the container (using the imported env.yml).
- The %environment section ensures the Conda environment is available by default when using the container.
  - Important: the path in export PATH="/opt/conda/envs/venv/bin:$PATH" contains the name of the environment (venv) defined in the env.yml file.
  - Make sure to update this value for your own environment name (i.e., the character string between /envs/ and /bin:$PATH).
- The %runscript section defines what runs when you call apptainer run with your container: here python, with any arguments passed through. It is not needed for apptainer exec ..., which runs whatever command you specify.
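
Because the environment name has to agree in two files, a quick consistency check before building can save a failed run. The following is a minimal sketch; the helper names env_name and check_env_path are hypothetical, and it assumes the env.yml declares its name on a top-level name: line as in the example above.

```shell
# Hypothetical helper: print the environment name declared in an env.yml.
env_name() {
    # Use the first line that starts with "name:" and print its value.
    awk '/^name:/ { print $2; exit }' "$1"
}

# Hypothetical helper: succeed only if the definition file refers to
# /opt/conda/envs/<name>/bin for the name found in the env.yml.
check_env_path() {
    local name
    name=$(env_name "$1") || return 1
    [ -n "$name" ] && grep -qF "/opt/conda/envs/$name/bin" "$2"
}

# Example:
#   check_env_path env.yml conda.def && echo "PATH matches the environment name"
```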

Finally, build the container using the env.yml and the conda.def:

    module load apptainer
    APPTAINER_BINDPATH="" apptainer build conda.sif conda.def
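
Before leaving the interactive session, it can be worth confirming that the image works. A minimal sketch, assuming the image is named conda.sif and contains pandas as in the example env.yml; check_container is a hypothetical helper name:

```shell
# Hypothetical helper: confirm the image exists and that pandas imports inside it.
check_container() {
    local sif=$1
    if [ ! -f "$sif" ]; then
        echo "image not found: $sif" >&2
        return 1
    fi
    # `apptainer exec` runs the given command inside the container.
    apptainer exec "$sif" python -c 'import pandas; print(pandas.__version__)'
}

# Example (run after the build has finished):
#   check_container conda.sif
```

If the import fails, the most likely causes are a package missing from the env.yml or an environment name that does not match the PATH set in the definition file.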

Pip Packages¶

Packages from pip can be added in the designated pip section of the env.yml file:

    name: venv
    channels:
      - conda-forge
    dependencies:
      - python=3.14
      - pandas
      - pip
      - pip:
          - your_pip_package

Installing from private GitHub repos¶

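If you need to install packages from private GitHub repositories, you can pass a GitHub token to the container build by exporting it in the %post section. Keep in mind that the definition file, token included, is stored in the metadata of the built image, so treat such an image as private.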

    Bootstrap: docker
    From: condaforge/miniforge3:latest

    %files
        ./env.yml /env.yml

    %post
        # For private GitHub Repos
        export GITHUB_TOKEN=<your_generated_token_here>

        mamba env create --yes -f /env.yml
        mamba clean --all -f -y

    %environment
        export PATH="/opt/conda/envs/venv/bin:$PATH"

    %runscript
        exec python "$@"

Then your env.yml file should look something like this:

    name: venv
    channels:
      - conda-forge
    dependencies:
      - python=3.14
      - pandas
      - pip
      - pip:
          - git+https://${GITHUB_TOKEN}@github.com/example/example.git
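
If the definition file lives in a shared or version-controlled directory, one option is to keep only a template there and fill in the token at build time. The sketch below assumes a template file named conda.def.in containing the line export GITHUB_TOKEN=@GITHUB_TOKEN@; the placeholder syntax, the template filename, and the helper render_def are all illustrative choices, not Apptainer features.

```shell
# Hypothetical helper: write a definition file with the token filled in.
# Usage: render_def <template> <output>; the token is read from $GITHUB_TOKEN.
render_def() {
    if [ -z "${GITHUB_TOKEN:-}" ]; then
        echo "GITHUB_TOKEN is not set" >&2
        return 1
    fi
    # Assumes the token contains only letters, digits, and underscores,
    # so it cannot clash with the "|" delimiter used by sed here.
    sed "s|@GITHUB_TOKEN@|${GITHUB_TOKEN}|g" "$1" > "$2"
    chmod 600 "$2"   # make the rendered file readable by its owner only
}

# Example:
#   export GITHUB_TOKEN=...            # set in your shell, not in a file
#   render_def conda.def.in conda.def
#   APPTAINER_BINDPATH="" apptainer build conda.sif conda.def
#   rm conda.def                       # remove the rendered file afterwards
```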

Installing system packages¶

If Conda/Mamba packages require additional system-level packages that are not included in the condaforge/miniforge3:latest image, you can install them in the %post section. For example:

    Bootstrap: docker
    From: condaforge/miniforge3:latest

    %files
        ./env.yml /env.yml

    %post
        # Install an example package using `apt`
        apt update && apt install -y libglib2.0-0

        mamba env create --yes -f /env.yml
        mamba clean --all -f -y

    %environment
        export PATH="/opt/conda/envs/venv/bin:$PATH"

    %runscript
        exec python "$@"

Note that the -y flag is used to automatically accept package installation prompts.

For more information, see the container documentation page.

Submitting an sbatch job with a Conda virtual environment container¶

Below is an example sbatch script for submitting a non-interactive job using an existing Conda environment:

    #!/usr/bin/bash -l

    #SBATCH --time=0-00:10:00    # runtime limit (D-HH:MM:SS), here 10 minutes
    #SBATCH --mem=4G             # memory per node
    #SBATCH --ntasks=1           # number of tasks (usually 1 for most jobs)
    #SBATCH --cpus-per-task=1    # number of CPUs per task

    # Load the apptainer module
    module load apptainer

    # The syntax is `apptainer exec <container> <command> <optional_arguments>`
    # conda.sif is the image built above; adjust the path if yours differs
    apptainer exec conda.sif python ./myscript.py arg1 arg2
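
To submit the script and follow the job, something like the following can be used. It is a sketch that assumes the script above is saved as job.sh (a hypothetical filename); job_id_from_parsable is likewise a helper introduced here. It relies on sbatch --parsable, which prints the job ID, optionally followed by a semicolon and the cluster name.

```shell
# Hypothetical helper: extract the job ID from `sbatch --parsable` output,
# which has the form "<jobid>" or "<jobid>;<cluster>".
job_id_from_parsable() {
    printf '%s\n' "${1%%;*}"
}

# Example (run on a login node; job.sh is an assumed filename):
#   jobid=$(job_id_from_parsable "$(sbatch --parsable job.sh)")
#   squeue -j "$jobid"          # check the job's state in the queue
#   cat "slurm-$jobid.out"      # default output file once the job has run
```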