How to use Conda environments on ScienceCluster

Conda is one of the software environment management tools offered on ScienceCluster. If you need a suite of tools for a specific programming language (e.g., R or Python), Conda can help you manage that environment so that it is both reproducible and portable to other systems.

The example script for Python shows the basics of using a Conda environment, with parallel steps outlined below.

Conda Instructions

Verify your .condarc file points to your data directory

Because Conda writes many files of various sizes whenever you create an environment or install packages, it's best to keep all of your Conda environments within the /data/$USER directory of the file system.

Conda uses a configuration file in a user's home folder, ~/.condarc, to assign the location for packages and environments.

You can confirm the contents of this file by running cat ~/.condarc. If you don't see any output when running this command, continue as directed below to create a new .condarc file. If the file exists but does not use the /data/$USER/ paths, run the command below as well to overwrite it.

To create a new /home/$USER/.condarc file, or to overwrite an existing file, run this command:

cat << EOF > /home/$USER/.condarc
# These two flags determine the default location of conda environments and pkgs
envs_dirs:
  - /data/$USER/conda/envs
pkgs_dirs:
  - /data/$USER/conda/pkgs
EOF
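
To double-check the result, a small helper along the following lines can report whether a .condarc file references the /data/$USER paths. This is only a sketch: the function name check_condarc and its three status words are hypothetical and not part of Conda or ScienceCluster, and it assumes the conda/envs and conda/pkgs paths used on this page.

```shell
# Hypothetical helper (not part of Conda): report whether a .condarc file
# points both envs_dirs and pkgs_dirs at /data/$USER/conda.
check_condarc() {
    condarc="${1:-$HOME/.condarc}"
    user="${USER:-$(id -un)}"
    if [ ! -s "$condarc" ]; then
        # File is absent or empty: create it as described on this page.
        echo "missing"
    elif grep -qF "/data/$user/conda/envs" "$condarc" &&
         grep -qF "/data/$user/conda/pkgs" "$condarc"; then
        echo "ok"
    else
        # File exists but uses other paths: overwrite it.
        echo "needs-update"
    fi
}

# Check the default file in your home folder.
check_condarc
```

If the helper prints anything other than ok, rerun the cat << EOF command shown on this page.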

Create your environment

Before beginning, load the Anaconda module. We recommend using an interactive session, because creating an environment requires non-negligible resources and can therefore affect other users of the login nodes.

# Request an interactive session; adapt the resources to your needs
srun --pty -n 1 -c 2 --time=00:30:00 --mem=4G bash -l
# Load the Anaconda module inside the session
module load anaconda3

Then, following the example script for Python, the command used to create a Conda environment is:

conda create --name myenv python=3.12

Note

⚠️ If you do not specify a version of Python when you create the environment, the system's default Python will be used. If you install additional packages using Conda, a newer version of Python may be installed and then made default in the environment.

Make sure to provide a suitable name for each environment so that you can keep track of them. To activate this newly created and empty environment, use:

source activate myenv

Note

⚠️ Conda will direct you to use conda activate. Make sure to use source activate instead.

If activation succeeds, you will see the name of the environment prepended to your command line prompt, like so:

(myenv) <username>@login0

Note: login0 may be another login node number, such as login1; this is expected behavior.

Once you have your Conda environment loaded, you can proceed with installing packages of interest.

conda install numpy

Always try conda install first when installing packages or other software. This command searches the configured Conda channels for the package(s) of interest. You can specify additional Conda channels using the -c flag; for more details on this flag, see the conda install documentation.
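
As an illustration of the -c flag, the sketch below installs a package from a named channel. The channel (conda-forge) and package (scipy) are illustrative choices, not recommendations from this page, and the snippet assumes that the Anaconda module is loaded and your environment is already activated.

```shell
# Illustrative values; substitute the channel and package you actually need.
channel="conda-forge"
package="scipy"

if command -v conda >/dev/null 2>&1; then
    # -c adds the given channel to the search for this one command.
    conda install -c "$channel" "$package"
else
    echo "conda not found; run 'module load anaconda3' first" >&2
fi
```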

Once your packages are installed in your environment, and the environment is activated, you'll be able to access those packages on the cluster. As such, all workflows depending on these packages will need to have the Conda environment activated at runtime. See the example script for Python for a demonstration of such a workflow.
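
For a rough idea of what activating the environment at runtime can look like, the sketch below writes a minimal Slurm batch script. It is not the example script for Python referred to on this page: the file name conda_job.sh, the resource requests, and the bash -l shebang are assumptions, while the module and environment names follow the steps above.

```shell
# Write a minimal batch script (hypothetical file name and resource values).
cat << 'EOF' > conda_job.sh
#!/bin/bash -l
#SBATCH --time=00:10:00
#SBATCH --mem=4G
#SBATCH --cpus-per-task=1

# Make Conda available and activate the environment before running any code
module load anaconda3
source activate myenv

# The job can now use packages installed in myenv
python -c "import numpy; print(numpy.__version__)"
EOF

# Submit the job with: sbatch conda_job.sh
```

Quoting 'EOF' keeps the shell from expanding anything inside the script while it is being written.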

To deactivate your Conda environment and go back to the default ScienceCluster system environment, use:

conda deactivate

Best practices

If your Conda software environment requires the use of packages that can only be installed using pip, consider these best practices when doing so.

Mamba

Mamba is a package manager that is fully compatible with Conda but performs faster than Conda on certain tasks. You can use Mamba instead of Conda by loading the mamba module and using the mamba command in place of conda.