
How to use Conda environments on the ScienceCluster

Conda is one of the software environment management tools offered on the ScienceCluster. If you need a suite of tools for a specific programming language (e.g., R or Python), Conda can help you manage that environment so that it is both portable to other systems and reproducible.

The example script for Python shows the basics of using a Conda environment, with parallel steps outlined below.

Conda Instructions

Create a .condarc file in your home directory

Because Conda writes many files of various sizes when you create environments and install packages, it's best to keep all of your Conda environments within the /data/$USER directory of the file system.

To do so, write a .condarc file to your home directory before you create your first environment. This file tells Conda where to place all of your environments and their packages when they are created.

Note

Users are currently equipped with a default .condarc in their home directory. You can confirm the contents of this file by running cat ~/.condarc. If you don't see any output when running this command, continue as directed below to create a new .condarc file.

Use the following command to create /home/$USER/.condarc, or to overwrite it if it already exists:

cat << EOF > /home/$USER/.condarc
# These two settings determine the default locations of Conda environments and package caches
envs_dirs:
  - /data/$USER/conda/envs
pkgs_dirs:
  - /data/$USER/conda/pkgs
EOF
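Because the command above overwrites any existing file, you may prefer to keep a backup first. The following is a minimal sketch under that assumption; the function name write_condarc and the .bak suffix are hypothetical choices for this illustration, not part of Conda or the ScienceCluster setup.

```shell
# Hypothetical helper: write the same .condarc settings as above, but keep
# a backup of any existing non-empty file instead of silently replacing it.
write_condarc() {
    target="${1:-$HOME/.condarc}"      # file to write (defaults to ~/.condarc)
    user="${USER:-$(id -un)}"          # fall back to id(1) if USER is unset
    if [ -s "$target" ]; then
        cp "$target" "$target.bak"     # preserve the previous contents
    fi
    cat << EOF > "$target"
envs_dirs:
  - /data/$user/conda/envs
pkgs_dirs:
  - /data/$user/conda/pkgs
EOF
}
```

Calling write_condarc with no argument targets ~/.condarc; passing a path lets you try it on a scratch file first.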

Create your environment

Before beginning, load the Anaconda module on the cluster using:

module load anaconda3

Then, following the example script for Python, the command used to create a Conda environment is:

conda create --name myenv python=3.10

Note

⚠️ If you do not specify a version of Python when you create the environment, the system's default Python will be used. If you install additional packages using Conda, a newer version of Python may be installed and then made default in the environment.
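Since later installs can change the Python version inside an environment, it can help to check the version before running a workflow. The sketch below assumes an environment is already activated and that python is on your PATH; check_python_version is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: compare the major.minor version of the "python"
# currently on PATH with the version you expect, e.g. 3.10.
check_python_version() {
    expected="$1"
    actual="$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')" || return 1
    if [ "$actual" != "$expected" ]; then
        echo "Python $actual is active, expected $expected" >&2
        return 1
    fi
}
```

Running check_python_version 3.10 after installing new packages returns a non-zero status if the environment's Python has moved to a different version.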

Give each environment a descriptive name so that you can keep track of them. To activate the newly created environment, use:

source activate myenv

Note

⚠️ Conda will direct you to use conda activate. Make sure to use source activate instead.

If activation succeeds, you will see the name of the environment prepended to your command line prompt, like so:

(myenv) <username>@login0

Note: login0 may be another login node number, such as login1; this is expected behavior.
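In scripts there is no prompt to inspect, so activation can be checked programmatically instead. The sketch below relies on the CONDA_DEFAULT_ENV variable, which Conda sets on activation; it assumes the environment lives in one of the configured envs_dirs, so the variable holds the environment name rather than a full path. The helper name require_conda_env is hypothetical.

```shell
# Hypothetical helper: succeed only if the named environment is active.
# CONDA_DEFAULT_ENV is set by Conda when an environment is activated.
require_conda_env() {
    expected="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" != "$expected" ]; then
        echo "expected '$expected', active: '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}
```

For example, require_conda_env myenv || exit 1 near the top of a script stops it early when the wrong environment (or none) is active.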

Once your Conda environment is activated, you can install the packages you need:

conda install numpy

Always try conda install first when installing packages or other software. This command searches the configured Conda channels for the package(s) of interest. You can specify additional Conda channels using the -c flag; for more details on this flag, see the conda install documentation.
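As an illustration of the -c flag, the sketch below wraps conda install in a small function. The function name install_from_channel is hypothetical, and conda-forge in the usage note is only an example channel; use whichever channel actually provides your package.

```shell
# Hypothetical helper: install one or more packages from a given channel.
# Usage: install_from_channel <channel> <package> [<package> ...]
install_from_channel() {
    if [ "$#" -lt 2 ]; then
        echo "usage: install_from_channel <channel> <package>..." >&2
        return 2
    fi
    channel="$1"
    shift
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found; load the Anaconda module first" >&2
        return 1
    fi
    # -c adds the channel to the search for this one command
    conda install -c "$channel" "$@"
}
```

With an environment activated, install_from_channel conda-forge numpy would run conda install -c conda-forge numpy.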

Once your packages are installed in your environment, and the environment is activated, you'll be able to access those packages on the cluster. As such, all workflows depending on these packages will need to have the Conda environment activated at runtime. See the example script for Python for a demonstration of such a workflow.
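A minimal sketch of activating the environment at runtime is shown below. It writes a small script that loads the module, activates myenv, and runs a one-line Python check. The file name run_myenv.sh is a placeholder chosen for this illustration, and any scheduler directives your job needs are left out; refer to the example script for Python for the cluster's actual submission format.

```shell
# Write a small script that activates the environment before running Python.
# The quoted 'EOF' keeps the shell from expanding anything inside the script.
cat << 'EOF' > run_myenv.sh
#!/bin/bash
# Scheduler directives, if any, would go here.
module load anaconda3
source activate myenv
# Confirm that a package installed in the environment is importable
python -c "import numpy; print(numpy.__version__)"
EOF
chmod +x run_myenv.sh
```

Keeping the activation inside the script means the job does not depend on whichever environment happened to be active in the shell that submitted it.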

To deactivate your Conda environment and go back to the default ScienceCluster system environment, use:

conda deactivate

Best practices

If your Conda software environment requires the use of packages that can only be installed using pip, consider these best practices when doing so.
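One commonly recommended precaution, taken from general Conda guidance rather than from ScienceCluster-specific instructions, is to make sure pip runs inside the activated environment. The sketch below checks that python resolves under CONDA_PREFIX, which Conda sets on activation; the helper name python_in_active_env is hypothetical.

```shell
# Hypothetical helper: confirm that "python" resolves inside the active
# Conda environment before running "python -m pip install ...".
python_in_active_env() {
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "no Conda environment is active" >&2
        return 1
    fi
    py_path="$(command -v python)" || return 1
    case "$py_path" in
        "$CONDA_PREFIX"/*) return 0 ;;
        *)
            echo "python resolves to $py_path, outside $CONDA_PREFIX" >&2
            return 1
            ;;
    esac
}
```

If the check passes, invoking pip as python -m pip install followed by the package name ties the install to that environment's interpreter rather than to whichever pip happens to come first on your PATH.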

Mamba

Mamba is a package manager that is fully compatible with Conda but performs faster than Conda on certain tasks. You can use Mamba instead of Conda by loading the mamba module and using the mamba command in place of conda.
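If you want scripts that work whether or not the mamba module has been loaded, one option is to pick the command at runtime. This is a sketch only; the variable name CONDA_CMD is a hypothetical choice for this illustration.

```shell
# Prefer mamba when it is available on PATH, otherwise fall back to conda.
if command -v mamba >/dev/null 2>&1; then
    CONDA_CMD=mamba
else
    CONDA_CMD=conda
fi
echo "Environment commands will use: $CONDA_CMD"
```

Later commands can then be written once, for example "$CONDA_CMD" create --name myenv python=3.10.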


Last update: March 17, 2023