How to use Conda environments on the ScienceCluster
Conda is one of the software environment management tools offered on the ScienceCluster. If you need a suite of tools for a specific programming language (e.g., R or Python), Conda can help you manage that environment so that it is both portable to other systems and reproducible.
The example script for Python shows the basics of using a Conda environment, with parallel steps outlined below.
.condarc file in your home directory
Because Conda writes many files of various sizes when you create environments and install packages, it's best to keep all of your Conda environments within the /data directory of the file system.
To do so, write a .condarc file to your home directory before you create your first environment. This file tells Conda where to place your environments and their packages when they are created.
Use the following template for your .condarc file:
# These two flags determine the default location of conda environments and pkgs
envs_dirs:
  - /data/<username>/conda/envs
pkgs_dirs:
  - /data/<username>/conda/pkgs
To create the file in your home directory, run nano .condarc immediately after logging in to the Cluster (i.e., while you are still in your home directory). Then paste the contents of the template into the nano editor, replacing <username> with your UZH shortname. To exit nano and save the file, follow the instructions at the bottom of the nano screen. (The usual commands to quit and save are Ctrl+X, then y when prompted with Save modified buffer, then Enter.)
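If you would rather not paste into an editor, the same file can be written from the shell. The following is a minimal sketch rather than an official ScienceCluster tool: the function name write_condarc is hypothetical, and it assumes that the $USER variable on the cluster matches your UZH shortname (pass the shortname explicitly if it does not). It refuses to overwrite an existing file.

```shell
# Hypothetical helper: write the .condarc template from this page.
# Usage: write_condarc [shortname] [target-file]
# Assumption: $USER equals your UZH shortname unless you pass one explicitly.
write_condarc() {
    condarc_user="${1:-$USER}"
    condarc_target="${2:-$HOME/.condarc}"

    # Do not clobber an existing configuration.
    if [ -e "$condarc_target" ]; then
        echo "Refusing to overwrite existing file: $condarc_target" >&2
        return 1
    fi

    cat > "$condarc_target" <<EOF
# These two flags determine the default location of conda environments and pkgs
envs_dirs:
  - /data/${condarc_user}/conda/envs
pkgs_dirs:
  - /data/${condarc_user}/conda/pkgs
EOF
}
```

Calling write_condarc with no arguments writes ~/.condarc for the current user. Once the anaconda3 module is loaded, running conda config --show envs_dirs pkgs_dirs should list the /data paths if the file was picked up.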
Create your environment
Before beginning, load the Anaconda module in the cluster using:
module load generic  # You can also load another module other than generic
module load anaconda3
Then, following the example script for Python, the command used to create a Conda environment is:
conda create --name tensorflowexample python=3.9
⚠️ If you do not specify a version of Python when you create the environment, the system's default Python will be used. If you install additional packages using Conda, a newer version of Python may be installed and then made default in the environment.
Make sure to provide a suitable name for each environment so that you can keep track of them. To activate this newly created and empty environment, use:
source activate tensorflowexample
⚠️ Conda will direct you to use conda activate. Make sure to use source activate instead.
If activation succeeds, you will see the name of the environment in parentheses, (tensorflowexample), prepended to your command line prompt. The hostname in the prompt (for example, login0) may be another login node number, such as login1; this is expected behavior.
Once you have your Conda environment loaded, you can proceed with installing packages of interest. For example, the TensorFlow example uses:
conda install tensorflow-gpu
Always try to use conda install first when attempting to install packages and/or software. This command will search through Conda channels for the package(s) of interest. You can specify additional Conda channels using the -c flag; for more details on this flag, see the conda install documentation.
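Because installing packages can change the Python version in an environment (see the warning above), it can be worth checking the interpreter after an install. The following is a minimal sketch, assuming the environment is active so that python resolves to the environment's interpreter; the function name python_series_matches is hypothetical.

```shell
# Hypothetical helper: succeed if a "Python X.Y.Z" version string belongs to
# the expected X.Y series (for example 3.9).
python_series_matches() {
    expected_series="$1"
    reported_version="$2"
    case "$reported_version" in
        "Python ${expected_series}" | "Python ${expected_series}."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call, with the environment activated:
#   python_series_matches 3.9 "$(python --version 2>&1)" \
#       || echo "Python is no longer in the 3.9 series" >&2
```

The comparison takes the version string as an argument rather than calling python itself, so it can be checked against any reported version.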
Once your packages are installed in your environment, and the environment is activated, you'll be able to access those packages on the cluster. As such, all workflows depending on these packages will need to have the Conda environment activated at runtime. See the example script for Python for a demonstration of such a workflow.
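Since workflows depend on the environment being active at runtime, a job script can fail early when it is not. The sketch below is an illustration and not part of the example script: require_conda_env is a hypothetical name, and it relies on the CONDA_DEFAULT_ENV variable, which Conda sets when an environment is activated (to the environment's name, for named environments like the one here).

```shell
# Hypothetical guard: stop early unless the expected Conda environment is active.
require_conda_env() {
    expected_env="$1"
    active_env="${CONDA_DEFAULT_ENV:-}"
    if [ "$active_env" != "$expected_env" ]; then
        echo "Expected Conda environment '$expected_env', found '${active_env:-none}'." >&2
        return 1
    fi
}

# Example call near the top of a job script, after `source activate tensorflowexample`:
#   require_conda_env tensorflowexample || exit 1
```

Failing before any real work starts makes a forgotten activation show up as a clear message rather than as a missing-package error later in the run.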
To deactivate your Conda environment and go back to the default ScienceCluster system environment, use:
If your Conda software environment requires the use of packages that can only be installed using pip, consider these best practices when doing so.