FAQs
What will happen to my queued jobs during maintenance?
When ScienceCluster maintenance occurs, Science IT admins will "drain" the ScienceCluster nodes so that the hardware and/or software used within the cluster can be updated. When a node is "drained", all currently running jobs will be allowed to finish, and no additional jobs from the queue will be started on it. The maintenance will then be performed once a node has completed all running jobs (i.e., there is no activity on the node).
During this process, the SLURM queue will continue to hold all jobs with their assigned priority. As soon as the ScienceCluster maintenance window has closed, and the nodes are freed from their "drained" status, all jobs in the queue will continue to run normally.
Of note, it will not be possible to schedule jobs with time frames that overlap with a scheduled maintenance window on a node. These jobs will simply be rejected from the queue when you attempt to submit them via sbatch. When this situation occurs, you should either adjust the time limit so that it doesn't overlap with the maintenance window or simply submit the job(s) after the maintenance has been completed.
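As a rough sketch of adjusting the time limit, the helper below computes how many whole minutes remain before a maintenance window starts, so that the value can be passed to sbatch --time (which accepts a plain number of minutes). The function name minutes_until, the example date, and job.sh are hypothetical placeholders, and it is an assumption that maintenance windows appear in the output of scontrol show reservation on this cluster.

```shell
# minutes_until is a hypothetical helper, not a Slurm command.
# Arguments: current time and maintenance start, both as Unix epoch seconds.
# Prints the whole minutes remaining before the window opens.
minutes_until() {
    echo $(( ($2 - $1) / 60 ))
}

# Illustrative use on the cluster (commented out; the date and job.sh are placeholders):
# scontrol show reservation                    # assumption: maintenance shows up as a reservation
# start=$(date -d '2030-01-15 08:00' +%s)      # maintenance start, copied by hand
# sbatch --time="$(minutes_until "$(date +%s)" "$start")" job.sh
```

In practice you would probably subtract a small safety margin from the result, since the job must finish before the window opens.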
I am over-quota. How can I clean up my file storage?
Consider storing large files in your scalable storage folder, which is in your project space and can be found by running the command quota.
Anaconda / Mamba
To clean up cached installation packages from Anaconda, run the following commands:
module load anaconda3
conda clean -a
pip cache purge
Or with Mamba:
module load mamba
mamba clean -a
pip cache purge
Singularity
Singularity stores its cache by default in a user's home folder. To determine your cache folder for Singularity:
echo $SINGULARITY_CACHEDIR
To clean the cache:
module load singularityce
singularity cache clean
You can change your Singularity cache path with this command:
export SINGULARITY_CACHEDIR=/scratch/$USER/
Or add it to your .bashrc file so that it is set each time you log in:
echo "export SINGULARITY_CACHEDIR=/scratch/$USER/" >> ~/.bashrc
source ~/.bashrc
echo $SINGULARITY_CACHEDIR
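Running the echo ... >> ~/.bashrc command more than once appends a duplicate line each time. A minimal sketch of an idempotent alternative is shown here; append_once is a hypothetical helper name, not part of Singularity or Bash.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper, shown for illustration only.
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat the line as a fixed string.
    # If the file does not exist yet, grep fails and the line is written.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage on the cluster (commented out so the sketch has no side effects):
# append_once "export SINGULARITY_CACHEDIR=/scratch/$USER/" ~/.bashrc
```

Calling the helper repeatedly leaves a single export line in the file.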
Hidden folders
Your /home/$USER and /data/$USER folders contain hidden folders that start with the . character. To list them, run:
ls -lrta
Folders that typically grow in size with cache or temporary files are .local and .cache. To find the storage used (in GB) in all hidden subfolders in your home and data, you can run this command:
for i in /home/$USER/.[!.]*/ /data/$USER/.[!.]*/ ; do echo -n -e "$i\t"; echo "$(getfattr --absolute-names -h -n ceph.dir.rbytes --only-values "$i") / 10^9" | bc; done
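The same idea can be written as a small function that sorts the hidden folders by size, which makes the largest offenders easier to spot. This is a sketch under assumptions: hidden_dir_sizes is a hypothetical name, sizes are printed in bytes rather than GB, and the du fallback (GNU du) is added here for file systems where the ceph.dir.rbytes attribute is not available.

```shell
# Print "<bytes><TAB><dir>" for each hidden subfolder of the given directory,
# largest first. hidden_dir_sizes is a hypothetical helper name.
hidden_dir_sizes() {
    for d in "$1"/.[!.]*/; do
        [ -d "$d" ] || continue   # skip the literal pattern when nothing matches
        # Try the CephFS recursive-size attribute first; fall back to du
        # (an assumption for non-Ceph file systems) if it cannot be read.
        bytes=$(getfattr --absolute-names -h -n ceph.dir.rbytes --only-values "$d" 2>/dev/null) \
            || bytes=$(du -sb "$d" | cut -f1)
        printf '%s\t%s\n' "$bytes" "$d"
    done | sort -rn
}

# Usage on the cluster (commented out):
# hidden_dir_sizes "/home/$USER"
# hidden_dir_sizes "/data/$USER"
```

The .[!.]*/ pattern skips the . and .. entries, so only real hidden folders are reported.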
Framework folders
Certain software frameworks (e.g., HuggingFace) cache files programmatically, which can be cleaned with their own commands. For example, with HuggingFace consider using:
huggingface-cli delete-cache