Storage

Storage Overview

There are four complementary filesystems where you can store your data.

Name	Path	Alias	Backup	Purged	Usage	Disks
home	`/home/$USER`	`~`	No	No	15 GB, 100k files (limit)	Redundant, SSD
data	`/data/$USER`	`~/data`	No	No	200 GB (limit)	Redundant, SSD
scratch	`/scratch/$USER`	`~/scratch`	No	30 days	20 TB (limit)	Redundant, HDD
scalable storage	`/shares/<PROJECT>`	see below	No	No	With cost contribution (quota); contact us to increase the quota	Redundant, HDD

Filesystems

Tip

To see an overview of your storage usage, run the command `quota` from a login node.

Warning

If you exceed the quota, you will not be able to write to that filesystem. See the FAQs below for how to clean up your file storage if you are over quota.

Home

Each user has a home directory where configuration files, source code, and other small important files can be stored. The directory is limited to 100,000 files and 15 GB of used space, which makes it impractical for large data storage or software installations.

Data

For persistent storage of larger files, use the data filesystem (`~/data` or `/data/$USER`). It has a limit of 200 GB and, like the other filesystems, is not backed up. This filesystem is also appropriate for software installations.

Scratch

The scratch filesystem (`~/scratch` or `/scratch/$USER`) is for temporary storage of large input data files used during your calculations. Each user has a quota of 20 TB, and a single file may not exceed 10 TB. This filesystem is meant for temporary storage only: according to the service agreement, any files older than 30 days are subject to deletion.
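To see which files are approaching the 30-day limit, something like the following sketch may help. It assumes GNU `find` and that file age is judged by modification time; the source does not say which timestamp the purge uses, so treat that as an assumption. The helper name `list_old_files` and its defaults are hypothetical.

```shell
# Hypothetical helper: list regular files under a directory that were last
# modified more than a given number of days ago.
# ASSUMPTION: age is measured by modification time (mtime).
list_old_files() {
    local dir="${1:-/scratch/$USER}"
    local days="${2:-30}"
    # -mtime +N: the file's age in whole days (fraction discarded) exceeds N
    find "$dir" -type f -mtime +"$days" -print
}
```

For example, `list_old_files /scratch/$USER 25` would list files that have not been modified for more than 25 days, which leaves some time to copy them elsewhere.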

Scalable Storage

Scalable group storage requires a cost contribution that is based on actual usage. By default, permissions are set so that each member of the project has access to the shared folder, which is located at `/shares/<PROJECT>` (replace `<PROJECT>` with your actual project name).

You can create a symlink called shares in your home directory that points to this shared group folder:

ln -s /shares/<PROJECT> ~/shares
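If you prefer a guarded version of the command above, the sketch below creates the link only when the target folder exists and the link name is still free. The helper name `link_project_share` is hypothetical, and the link location defaults to `~/shares` as in the command above.

```shell
# Hypothetical helper: create a symlink to a project folder, refusing to
# proceed if the target is missing or the link name is already in use.
link_project_share() {
    local target="$1"
    local link="${2:-$HOME/shares}"
    if [ ! -d "$target" ]; then
        echo "target not found: $target" >&2
        return 1
    fi
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "already exists: $link" >&2
        return 1
    fi
    ln -s "$target" "$link"
}
```

It would be called as `link_project_share /shares/<PROJECT>`, with `<PROJECT>` replaced by your project name.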

Further Storage Options

UZH Central IT (ZI) offers further storage options. For more details, please check General Topics → Data Storage.

- Data Archiving
- Data Publishing
- Network Storage (SMB- and NFS-based) with high availability and backup
  - For instructions on connecting to an SMB share, see the article How to connect to a UZH NAS from the ScienceCluster.
  - Instructions for connecting to an NFS share will follow.

FAQs

I am over-quota. How can I clean up my file storage?

Consider moving large files to your scalable storage folder, which is in your project space. You can find its location by running the command `quota`.

Folders that typically grow in size with cache or temporary files are `.local` and `.cache`. To see the storage used by each subfolder, including hidden ones, run the following from within your `/home/$USER` and `/data/$USER` folders:

ls -lha
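What `ls -lha` shows for a directory depends on how the filesystem reports directory sizes. As an alternative that does not rely on that behaviour, the following sketch sums up each immediate subfolder with `du`. It assumes GNU `du` and `sort`; the helper name `subfolder_usage` is hypothetical.

```shell
# Hypothetical helper: print the disk usage of a directory and each of its
# immediate subfolders (hidden ones included), sorted by size.
# Requires GNU du (--max-depth) and GNU sort (-h).
subfolder_usage() {
    local dir="${1:-$HOME}"
    # Errors for unreadable entries are discarded to keep the listing clean
    du -h --max-depth=1 "$dir" 2>/dev/null | sort -h
}
```

For example, `subfolder_usage /data/$USER` lists the largest subfolders last, so the main consumers of quota appear at the bottom of the output.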

In addition, you may want to check the number of files in your /home/$USER directory with:

cat /home/$USER

The total number of files and directories will be shown as rentries, and it must not exceed 100,000.
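If the `rentries` value is not shown, the number of entries can also be counted directly. This is a minimal sketch assuming GNU `find`; the helper name `count_entries` is hypothetical, and the result may differ slightly from `rentries` depending on what the filesystem includes in its own count.

```shell
# Hypothetical helper: count every file and directory below a folder
# (the folder itself is excluded via -mindepth 1).
# Printing one dot per entry and counting bytes avoids miscounting names
# that contain newline characters.
count_entries() {
    local dir="${1:-$HOME}"
    find "$dir" -mindepth 1 -printf '.' 2>/dev/null | wc -c
}
```

Running `count_entries /home/$USER` prints a single number that can be compared against the 100,000 limit. On a large directory tree this walk can take a while.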

If you can no longer log in to the cluster, you can still connect with a terminal-only session:

ssh -t <shortname>@cluster.s3it.uzh.ch bash

and then remove files or kill processes if needed.

Conda

To clean up cached installation packages from Conda environments, see here.

Singularity

For Singularity cache setup and cleanup instructions, see this article.

Framework folders

Certain software frameworks (e.g., HuggingFace) cache files programmatically, which can be cleaned with their own commands. For example, with HuggingFace consider using:

huggingface-cli delete-cache

You can find information on such commands in the framework-specific documentation, such as this page on HuggingFace cache cleaning.
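Before cleaning, it can help to check where the HuggingFace cache lives and how large it is. The sketch below assumes the default location rules of the `huggingface_hub` library (`HF_HOME` if set, otherwise `huggingface` under `XDG_CACHE_HOME` or `~/.cache`) and does not account for other overrides such as `HF_HUB_CACHE`. The helper names `hf_cache_dir` and `hf_cache_size` are hypothetical.

```shell
# Hypothetical helper: print the directory where the HuggingFace cache is
# expected, following the documented default for HF_HOME.
# ASSUMPTION: no other override (e.g. HF_HUB_CACHE) is in effect.
hf_cache_dir() {
    echo "${HF_HOME:-${XDG_CACHE_HOME:-$HOME/.cache}/huggingface}"
}

# Hypothetical helper: report the total size of that directory, if present.
hf_cache_size() {
    local dir
    dir="$(hf_cache_dir)"
    if [ -d "$dir" ]; then
        du -sh "$dir"
    else
        echo "no cache directory at $dir"
    fi
}
```

Calling `hf_cache_size` prints the size and path of the cache, which indicates whether running `huggingface-cli delete-cache` is likely to free a meaningful amount of space.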