
How to use Conda environments on the ScienceCluster

Conda is one of the software environment management tools offered on the ScienceCluster. If you need a suite of tools for a specific programming language (e.g., R or Python), Conda can help you manage that environment so that it is portable to other systems and reproducible.

The example script for Python shows the basics of using a Conda environment, with parallel steps outlined below.

Conda Instructions

Create a .condarc file in your home directory

Because Conda writes many files of varying sizes when you create environments and install packages, it's best to keep all of your Conda environments in the /data directory of the file system.

To do so, write a .condarc file to your home directory before you create your first environment. This file tells Conda where to place your environments and their packages when they are created.

Use the following template for your .condarc file.

# These two flags determine the default location of conda environments and pkgs
envs_dirs:
  - /data/<username>/conda/envs
pkgs_dirs:
  - /data/<username>/conda/pkgs

To create the file, run nano .condarc immediately after logging in to the cluster (i.e., while you are still in your home directory). Then paste the contents of the template into the nano editor, replacing <username> with your UZH shortname. To save the file and exit nano, follow the instructions at the bottom of the nano screen. (The usual sequence is ^X, then y when prompted with "Save modified buffer?", then return/enter.)
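If you would rather not use an editor, the same file can be written directly from the shell. The following is a minimal sketch of one way to do that. The function name write_condarc is hypothetical (it is not part of Conda or the cluster tooling), and it deliberately refuses to overwrite an existing file so that settings you already have are not lost.

```shell
# Hypothetical helper: write the .condarc template for a given shortname.
#   $1: your UZH shortname
#   $2: target file (optional, defaults to ~/.condarc)
write_condarc() {
  shortname="$1"
  target="${2:-$HOME/.condarc}"

  if [ -z "$shortname" ]; then
    echo "usage: write_condarc <shortname> [target-file]" >&2
    return 1
  fi

  # Do not clobber an existing configuration.
  if [ -e "$target" ]; then
    echo "$target already exists; leaving it unchanged" >&2
    return 1
  fi

  # Same keys as the template above, with <username> filled in.
  cat > "$target" <<EOF
envs_dirs:
  - /data/${shortname}/conda/envs
pkgs_dirs:
  - /data/${shortname}/conda/pkgs
EOF
}
```

For example, write_condarc "$USER" writes ~/.condarc, assuming your login name on the cluster is the same as your UZH shortname. You can check the result with cat ~/.condarc.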

Create your environment

Before beginning, load the Anaconda module on the cluster using:

module load generic
# You can also load a module other than generic
module load anaconda3

Then, following the example script for Python, the command used to create a Conda environment is:

conda create --name tensorflowexample python=3.9

Note

⚠️ If you do not specify a version of Python when you create the environment, the system's default Python will be used. If you install additional packages using Conda, a newer version of Python may be installed and then made default in the environment.

Give each environment a descriptive name so that you can keep track of them. To activate the newly created environment, use:

source activate tensorflowexample

Note

⚠️ Conda will direct you to use conda activate. Make sure to use source activate instead.

If activation succeeds, the name of the environment is prepended to your command line prompt, like so:

(tensorflowexample) <username>@login0

Note: login0 may be another login node number, such as login1; this is expected behavior.
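In a script there is no prompt to look at, so it can help to check the active environment explicitly. The following is a minimal sketch that relies on the CONDA_DEFAULT_ENV variable, which Conda sets on activation (to the environment's name, for environments located in your configured envs_dirs). The function name require_env is hypothetical.

```shell
# Hypothetical helper: succeed only if the named environment is active.
#   $1: expected environment name
require_env() {
  expected="$1"

  if [ -z "$expected" ]; then
    echo "usage: require_env <environment-name>" >&2
    return 1
  fi

  # CONDA_DEFAULT_ENV is unset or empty when no environment is active.
  if [ "${CONDA_DEFAULT_ENV:-}" = "$expected" ]; then
    return 0
  fi

  echo "expected environment '$expected', found '${CONDA_DEFAULT_ENV:-none}'" >&2
  return 1
}
```

Placing require_env tensorflowexample || exit 1 near the top of a script stops it before any work is done in the wrong environment.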

Once your Conda environment is activated, you can proceed with installing packages of interest. For example, the TensorFlow example uses:

conda install tensorflow-gpu

Always try conda install first when installing packages or other software. This command searches the configured Conda channels for the requested package(s). You can specify additional channels using the -c flag (for example, -c conda-forge); for more details on this flag, see the conda install documentation.

Packages installed in an environment are available on the cluster only while that environment is activated. As such, every workflow that depends on these packages needs to activate the Conda environment at runtime. See the example script for Python for a demonstration of such a workflow.
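As an illustration of activating the environment at runtime, the sketch below writes a small batch script to disk. It assumes that jobs are submitted through Slurm (hence the #SBATCH lines and the sbatch command), which this page does not state. The file names conda_job.sh and train.py and the resource values are placeholders, not recommendations.

```shell
# Write a job script that loads the module and activates the environment
# before running any Python code. Quoting 'EOF' keeps the content literal.
cat > conda_job.sh <<'EOF'
#!/usr/bin/env bash
# Placeholder resource requests (assumes Slurm); adjust to your workload.
#SBATCH --job-name=conda-example
#SBATCH --time=00:10:00
#SBATCH --mem=4G

# Same module commands as in the interactive session.
module load generic
module load anaconda3

# Activate the environment created earlier, then run the workload.
source activate tensorflowexample

# train.py is a hypothetical script that imports the installed packages.
python train.py
EOF

# Under the Slurm assumption, the job would then be submitted with:
#   sbatch conda_job.sh
```

The activation line sits inside the job script, not in your login session, because the job runs in its own shell on a compute node.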

To deactivate your Conda environment and go back to the default ScienceCluster system environment, use:

conda deactivate

Best practices

If your Conda software environment requires the use of packages that can only be installed using pip, consider these best practices when doing so.


Last update: January 19, 2022