High performance computing - HPC

This service provides access to the UNIL high performance computing infrastructure (clusters) for processing non-sensitive research data.

Getting Started


DCSR? Kesako?

The full name is the Division de Calcul et Soutien à la Recherche (Computing and Research Support unit).

The mission of the DCSR is to supply the University of Lausanne with compute and storage capabilities for all areas of research.

As well as managing compute and storage systems, we also provide user support.

The official DCSR homepage is at: https://www.unil.ch/ci/dcsr-en

Getting Started

How to access the clusters

The DCSR maintains a general purpose cluster (Curnagl), which is described here. Researchers who need to process sensitive data must use the air-gapped cluster Urblauna, which has replaced Jura.

There are several requirements to be able to connect to the clusters:

  1. Have a UNIL account
  2. Be part of a PI's project
  3. Be on the UNIL or CHUV network (either physically or via the UNIL VPN if you work remotely)
  4. Have an SSH client

Step 0: Have a UNIL account

This applies to members of the CHUV community as well as to external collaborators.

See the documentation for how to get a UNIL account

CHUV users should also consult https://www.unil.ch/ci/ui/ext-hosp for more information.

Step 1: Be part of a PI project

To access the clusters, your PI first needs to request resources via: https://conference.unil.ch/research-resource-requests/. The PI must then add you as a member of one of their projects. Your access should be granted within 24 hours.

Step 2: Activate the UNIL VPN

Unless you are physically on the UNIL network, you need to activate the UNIL VPN (Crypto). Documentation on how to install and run it can be found here.

Step 3: Open a SSH client

On Linux and macOS, an SSH client is available by default; you simply need to open a terminal.

Windows users can either use PowerShell (on Windows 10 and later) or install a third-party client such as PuTTY or MobaXterm.

Step 4: Log into the cluster

Curnagl

ssh -X <username>@curnagl.dcsr.unil.ch

where <username> is your UNIL username. You will be prompted for your UNIL password.

Note: we strongly recommend that you set up SSH keys to connect to the clusters and that you protect your SSH keys with a passphrase.
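
For example, a key pair can be generated and copied to the cluster as follows (a minimal sketch; the full procedure for each operating system is described in the SSH connection page of this documentation):

# Generate an ed25519 key pair and protect it with a passphrase when prompted
ssh-keygen -t ed25519

# Copy the public key to the cluster (you will be asked for your UNIL password once)
ssh-copy-id <username>@curnagl.dcsr.unil.ch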

More details are available regarding the different clients in this documentation.

Urblauna

See the Urblauna documentation

Getting Started

I'm a PI and would like to use the clusters - what do I do?

It's easy! Please fill in the request form at https://conference.unil.ch/research-resource-requests/ and we'll get back in touch with you as soon as possible.

Help!


How do I ask for help?

Before asking for help please take the time to check that your question hasn't already been answered in our FAQ.

To contact us, please send an e-mail to the UNIL Helpdesk at helpdesk@unil.ch, starting the subject with DCSR, for example:

From:    user.lambda@unil.ch

To:      helpdesk@unil.ch

Subject: DCSR Cannot run CowMod on Curnagl

Dear DCSR,
I am unable to run the CowMod code on Curnagl - please see job number 1234567 for example.

The error message is "No grass left in field - please move to alpage"

You can find my input in /users/ulambda/CowMod/tests/hay/

To reproduce the issue on the command line, the following recipe works (or rather, doesn't work):

module load CowMod
cd /users/ulambda/CowMod/tests/hay/
CowMod --input=Feedtest 

Thanks

Dr Lambda

It helps us if you can provide all relevant information including how we can reproduce the problem and a Job ID if you submitted your task via the batch system.

Once we have analysed your problem we will get in touch with you.

Help!

Recovering deleted files?

This depends on where the file was and when it was created and deleted.

/scratch

There are no backups and no snapshots, so the file is gone forever.

/users

If it was in your home directory /users/<username>  then you can recover files from up to 7 days ago using the built-in snapshots by navigating to the snapshot directory as follows:

[ulambda@login ~]$ pwd
/users/ulambda

[ulambda@login ~]$ date
Tue Jun  1 13:59:28 CEST 2021

[ulambda@login ~]$ cd /users/.snapshots/

[ulambda@login .snapshots]$ ls
2021-05-26  2021-05-27  2021-05-28  2021-05-29  2021-05-30  2021-05-31  2021-06-01

[ulambda@login .snapshots]$ cd 2021-05-31/ulambda

[ulambda@login ]$ pwd
/users/.snapshots/2021-05-31/ulambda

[ulambda@login ]$ ls
..
my_deleted_file_from_yesterday
..
..
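
Once you have found the file in the snapshot, simply copy it back to your home directory, for example (using the file from the listing above):

cp /users/.snapshots/2021-05-31/ulambda/my_deleted_file_from_yesterday /users/ulambda/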


 

The snapshots are taken at around 3 am, so if you created a file in the morning and deleted it the same afternoon then we can't help.

Beyond 7 days the file is lost forever.

Infrastructure and Resources


Curnagl

Kesako?


Curnagl (Romanche), or Chocard à bec jaune in French, is a sociable bird known for its acrobatic exploits and is found throughout the alpine region. More information is available at https://www.vogelwarte.ch/fr/oiseaux/les-oiseaux-de-suisse/chocard-a-bec-jaune

It's also the name of the HPC cluster managed by the DCSR for the UNIL research community. 

If you need to describe the cluster, a concise description is:

Curnagl is a 96-node HPC cluster based on AMD Zen2/3 CPUs, providing a total of 4608 compute cores and 54 TB of memory. Eight machines are equipped with 2 NVIDIA A100 GPUs, and all nodes have 100 Gb/s HDR InfiniBand and 100 Gb/s Ethernet network connections in a fat-tree topology. The principal storage is a 2 PB disk-based filesystem and a 150 TB SSD-based scratch system. Additionally, all nodes have 1.6 TB local NVMe drives.

If you experience unexpected behaviour or need assistance please contact us via helpdesk@unil.ch starting the mail subject with DCSR Curnagl


How to connect

The login node is curnagl.dcsr.unil.ch

For full details on how to connect using SSH please read the documentation


Before connecting we recommend that you add the host's key to your list of known hosts:

echo "curnagl.dcsr.unil.ch ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCunvgFAN/X/8b1FEIxy8p3u9jgfF0NgCl7CX4ZmqlhaYis2p7AQ34foIXemaw2wT+Pq1V9dCUh18mWXnDsjGrg=" >> ~/.ssh/known_hosts

You can also type "yes" during the first connection to accept the host key but this is less secure.

Please be aware that you must be connected to the VPN if you are not on the campus network.

Then simply ssh username@curnagl.dcsr.unil.ch where username is your UNIL account

The login node must not be used for any form of compute or memory intensive task apart from software compilation and data transfer. Any such tasks will be killed without warning.

Hardware

Compute

The cluster is composed of 96 compute nodes of which eight have GPUs. 

Number of nodes   Memory    CPU                  GPU
52                512 GB    2 x AMD Epyc2 7402   -
12                1024 GB   2 x AMD Epyc2 7402   -
8                 512 GB    2 x AMD Epyc2 7402   2 x NVIDIA A100
24                512 GB    2 x AMD Epyc3 7443   -


Network

The nodes are connected with both HDR Infiniband and 100 Gb Ethernet. The Infiniband is the primary interconnect for storage and inter-node communication.

Partitions 

There are 3 main partitions on the cluster:

interactive

The interactive partition allows rapid access to resources, but comes with a number of restrictions, the main ones being limits on the CPU, memory and GPU resources that can be requested and on the allowed run time. For example:

CPU cores requested   Memory requested   GPUs requested   Run time allowed
4                     32 GB              1                8 hours
8                     64 GB              1                4 hours
16                    128 GB             1                2 hours
32                    256 GB             1                1 hour

We recommend that users access this using the Sinteractive command. This partition should also be used for compiling codes.
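
For example, to request an interactive session corresponding to the first row of the table above (the values are illustrative and must respect the limits shown; the Sinteractive options are described later in this documentation):

Sinteractive -c 4 -m 32G -t 08:00:00 -G 1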

This partition can also be accessed using the following sbatch directive:

#SBATCH -p interactive 

Note on GPUs in the interactive partition

There is one node with GPUs in the interactive partition and in order to allow multiple users to work at the same time these A100 cards have been partitioned into 2 instances each with 20GB of memory for a total of 4 GPUs. 

The maximum time limit for requesting a GPU is 8 hours with the CPU and memory limits applying. 

For longer jobs and to have whole A100 GPUs please submit batch jobs to the gpu partition.

Please do not block resources if you are not using them as this prevents other people from working.

If you request too many resources then you will see the following error:

salloc: error: QOSMaxCpuMinutesPerJobLimit
salloc: error: Job submit/allocate failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)

Please reduce either the time or the CPU / memory / GPU resources requested.

cpu

This is the main partition and includes the majority of the compute nodes. Interactive jobs are not permitted. The partition is configured to prevent long running jobs from using all available resources and to allow multi-node jobs to start within a reasonable delay.

The limits are:

Normal jobs - 3 days

Short jobs - 12 hours

Normal jobs are restricted to ~2/3 of the resources, which prevents the cluster from being blocked by long-running jobs.

In exceptional cases wall time extensions may be granted but for this you need to contact us with a justification before submitting your jobs!

The cpu partition is the default partition so there is no need to specify it, but if you wish to do so then use the following sbatch directive:

#SBATCH -p cpu

gpu

This contains the GPU equipped nodes. 

To request resources in the gpu partition please use the following sbatch directive:

#SBATCH -p gpu

The limits are:

Normal jobs - 3 days

Short jobs - 12 hours

Normal jobs are restricted to ~2/3 of the resources, which prevents the cluster from being blocked by long-running jobs.

To request GPUs, add the following directive to your job script:

--gres=gpu:N

where N is 1 or 2.
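
For example, to request one GPU in the gpu partition:

#SBATCH -p gpu
#SBATCH --gres=gpu:1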

Software

For information on the DCSR software stack see the following link:

https://wiki.unil.ch/ci/books/high-performance-computing-hpc/page/dcsr-software-stack


Storage

The storage is provided by a Lenovo DSS system and the Spectrum Scale (GPFS) parallel filesystem.

/users

Your home space is at /users/username and there is a per user quota of 50 GB and 100,000 files.

We would like to remind you that all scripts and code should be stored in a Git repository.

/scratch

The scratch filesystem is the primary working space for running calculations.

The scratch space runs on SSD storage and has an automatic cleaning policy: in case of a shortage of free space, files older than 2 weeks will be deleted, starting with the oldest first.

Initially this cleanup will be triggered if the space is more than 90% used and this limit will be reviewed as we gain experience with the usage patterns.

The space is per user and there are no quotas (*). Your scratch space can be found at /scratch/username 

e.g. /scratch/ulambda

Use of this space is not charged for as it is now classed as temporary storage.

* There is a quota of 50% of the total space per user to prevent runaway jobs wreaking havoc

/work

The work space is for storing data that is being actively worked on as part of a research project. Projects have quotas assigned and, while we will not delete data in this space, there is no backup, so all critical data must also be kept on the DCSR NAS.

The structure is: 

/work/FAC/<FACULTY>/<INSTITUTE>/<PI>/<PROJECT>

This space can, and should, be used for the installation of any research-group-specific software tools, including Python virtual environments.
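
For example, a Python virtual environment can be created under your project's work space (the path below is illustrative; use your own project path):

module load python
python3 -m venv /work/FAC/<FACULTY>/<INSTITUTE>/<PI>/<PROJECT>/venvs/my_env
source /work/FAC/<FACULTY>/<INSTITUTE>/<PI>/<PROJECT>/venvs/my_env/bin/activate
pip install numpy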



Infrastructure and Resources

Curnagl - 2022

Following the migration to the CCT datacenter there are a number of things that have changed that you should be aware of:


New login node

When you first connect to curnagl.dcsr.unil.ch you will receive a warning that the host key has changed and you will not be allowed to connect.

Please remove the old host key for curnagl.dcsr.unil.ch from your ~/.ssh/known_hosts file (ssh-keygen -R curnagl.dcsr.unil.ch) and reconnect.

The new login node is identical to the compute nodes (it is a compute node) but as previously it should not be used for running calculations.


New software stack

The slightly delayed 2022 DCSR software stack is now in production and includes more recent compilers as well as new versions of packages and libraries.

For more information see https://wiki.unil.ch/ci/books/high-performance-computing-hpc/page/dcsr-software-stack

The old software stack remains available although no new packages will be added to it.

To switch between software stacks there is the new dcsrsoft tool:

# Show which stack is being used

[ulambda@curnagl ~]$ dcsrsoft show
Running with Prod

# Switch to the 2021 stack

[ulambda@curnagl ~]$ dcsrsoft use old
Switching to the old software stack

# Switch to the unsupported Vital-IT software stack

[ulambda@curnagl ~]$ dcsrsoft use vitalit
Switching to the distant past

# Switch back to the 2022 stack

[ulambda@curnagl ~]$ dcsrsoft use prod
Switching to the prod software stack

The dcsrsoft command is a bash function and should be executed on the front-end (login) node. In order to use an older stack in a job, you need to execute the commands above before launching your job with sbatch.
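
For example, to submit a job that should run with the 2021 stack (the job script name is illustrative):

dcsrsoft use old
sbatch my_job.sh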

More disk space

Soon the available disk space will be doubled, with 2 PB available for /work.


More nodes

Once the migration is complete there will be an additional 24 compute nodes bringing the total to 96 machines of which 12 have 1TB of memory and 8 have A100 GPUs.


Infrastructure and Resources

Storage on Curnagl

Where is data stored

Long-term research data is stored on the DCSR NAS. This storage is accessible from within the UNIL network using the SMB/CIFS protocol. It is also accessible on the cluster login node at /nas (see this guide).

The UNIL HPC clusters also have dedicated storage that is shared amongst the compute nodes but this is not, in general, accessible outside of the clusters  except via file transfer protocols (scp).

This space is intended for active use by projects and is not a long term store.

Cluster filesystems

The cluster storage is based on the IBM Spectrum Scale (GPFS) parallel filesystem. There are two disk-based filesystems (users and work) and one SSD-based one (scratch). Whilst there is no backup, the storage is reliable and resilient to disk failure.

The role of each filesystem as well as details of the data retention policy is given below.

How much space am I using?

For the users and work filesystems the quotacheck command allows you to see the used and allocated space:

 

[ulambda@login ~]$ quotacheck 
### Work Quotas ###
 
Project: pi_ulambda_100111-pr-g 
 
                         Block Limits                                    |     File Limits
Filesystem type         blocks      quota      limit   in_doubt    grace |    files   quota    limit in_doubt    grace  Remarks
work       FILESET      304.6G     1.999T         2T          0     none |  1107904 9990000 10000000        0     none DCSR-DSS.dcsr.unil.ch
 
 
Project: gruyere_100666-pr-g 
 
                         Block Limits                                    |     File Limits
Filesystem type         blocks      quota      limit   in_doubt    grace |    files   quota    limit in_doubt    grace  Remarks
work       FILESET           0        99G       100G          0     none |        1  990000  1000000        0     none DCSR-DSS.dcsr.unil.ch

 
### User Quota ###
 
                         Block Limits                                    |     File Limits
Filesystem type         blocks      quota      limit   in_doubt    grace |    files   quota    limit in_doubt    grace  Remarks
users      USR          8.706G        50G        51G       160M     none |    66477  102400   103424      160     none DCSR-DSS.dcsr.unil.ch

Users

/users/<username>

This is your home directory and can be used for storing small amounts of data. The per user quota is 50 GB and 100,000 files.

There are daily snapshots kept for seven days in case of accidental file deletion. See here for more details. 

Work

/work/<path to my project>

This space is allocated per project and the quota can be increased on request by the PI as long as free space remains. 

This space is not backed up but there is no over-allocation of resources so we will never ask you to remove files.

Scratch

/scratch/<username>

The scratch space is for intermediate files and the results of computations. There is no quota and the space is not charged for. You should think of it as temporary storage for a few weeks while running calculations.

In case of limited space, files will be automatically deleted to free up space. The current policy is that if the usage reaches 90%, files will be removed, starting with the oldest first, until the occupancy is reduced to 70%. No files newer than two weeks old will be removed.

$TMPDIR

For certain types of calculation it can be useful to use the NVMe drive on the compute node. This has a capacity of ~400 GB and can be accessed inside a batch job by using the $TMPDIR variable.

At the end of the job this space is automatically purged.
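
A typical pattern is to copy the input data to $TMPDIR, run the computation on the fast local drive and copy the results back before the job ends; a minimal sketch (the tool and file names are illustrative):

# Inside your batch script, after the #SBATCH directives and module loads
cp /scratch/ulambda/my_input.dat $TMPDIR/
cd $TMPDIR
my_analysis --input my_input.dat --output results.dat
# $TMPDIR is purged automatically at the end of the job, so copy the results back
cp results.dat /scratch/ulambda/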

 

 

Infrastructure and Resources

Jura

Jura is a cluster for the analysis of sensitive data and is primarily used by the CHUV.

The Jura cluster is replaced by Urblauna 

Computing resources

Storage resources

ATTENTION: the /data directory is NOT BACKED UP

Getting resources on Jura

Accessing the infrastructure from UNIL

ATTENTION PROPER LOG OUT

Transferring data in

sib-1-24:~ someuser$ sftp someuser@jura.dcsr.unil.ch
Password:
Verification code:
Connected to someuser@jura.dcsr.unil.ch.
sftp> dir
data
sftp> cd data
sftp> dir
sftp> put AVeryImportantFile.tgz
Uploading AVeryImportantFile.tgz to /data/AVeryImportantFile.tgz
AVeryImportantFile.tgz

Transferring code in/out


There is a DCSR managed Git service accessible from Jura. More information can be found at

https://wiki.unil.ch/ci/books/service-de-calcul-haute-performance-%28hpc%29/page/why-is-there-a-dcsr-gitlab-service-and-what-is-it


Accessing the infrastructure from CHUV

ssh <unil-username>@stockage-horus.chuv.ch


Infrastructure and Resources

Urblauna

Kesako?

Urblauna (Romanche), or Lagopède Alpin in French, is a bird known for its changing plumage which functions as a very effective camouflage. More information is available at https://www.vogelwarte.ch/fr/oiseaux/les-oiseaux-de-suisse/lagopede-alpin

It's also the name of our new sensitive data compute cluster, which has replaced the Jura cluster.

Information on how to connect to Urblauna can be found here.

Information on the Jura to Urblauna migration can be found here

The differences between Jura and Urblauna are described here

Hardware

Compute

The cluster is composed of 18 compute nodes of which two have GPUs. All use the same 24-core processor.

Number of nodes   Memory    CPU                  GPU
16                1024 GB   2 x AMD Epyc3 7443   -
2                 1024 GB   2 x AMD Epyc3 7443   2 x NVIDIA A100

The GPUs are partitioned to create 4 GPUs on each machine, with 20 GB of memory per GPU.

Storage

The storage is based on IBM Spectrum Scale / Lenovo DSS and provides 1PB of space in the /data filesystem.

Whilst reliable, this space is not backed up and all important data should also be stored on /archive.

The Curnagl /work filesystem is visible in read-only mode on Urblauna and can be used to install software on an internet connected system before using it on Urblauna.

Filesystem mount point   Description
/users                   Urblauna home directory
/scratch                 Urblauna scratch space (automatic cleanup)
/data                    Urblauna data space (no backup)
/archive                 Secure data space with backup (login node access only)
/work                    Curnagl data space (read only)
/jura_home               Jura home directories (read only, login node only)
/jura_data               Jura data space (read only, login node only)

Software

For information on the DCSR software stack see the following link:

https://wiki.unil.ch/ci/books/high-performance-computing-hpc/page/dcsr-software-stack

Slurm partitions

On Urblauna there are two partitions - "urblauna" and "interactive"

$ sinfo

PARTITION   AVAIL  TIMELIMIT  NODES  STATE NODELIST
urblauna*      up 3-00:00:00     17   idle sna[002-016],snagpu[001-002]
interactive    up    8:00:00      4   idle sna[015-016],snagpu[001-002]

There is no separate GPU partition, so to use a GPU simply request:

#SBATCH --gres=gpu:1

To launch an interactive session you can use Sinteractive as on Curnagl

Using the Clusters


How to run a job on Curnagl

Overview

Suppose that you have finished writing your code, say a Python script called <my_code.py>, and you want to run it on the cluster Curnagl. You will need to submit a job (a bash script) with information such as the number of CPUs you want to use and the amount of RAM you will need. This information will be processed by the job scheduler (software installed on the cluster) and your code will be executed. The job scheduler used on Curnagl is called SLURM (Simple Linux Utility for Resource Management). It is free, open-source software used by many of the world's computer clusters.

The partitions

The clusters contain several partitions (sets of compute nodes dedicated to different purposes). To list them, type

sinfo

As you can see, there are three partitions: interactive, cpu and gpu.

Each partition is associated with a submission queue. A queue is essentially a waiting line for your compute job to be matched with an available compute resource. Those resources become available once a compute job from a previous user is completed.

Note that the nodes may be in different states: idle=not used, alloc=used, down=switched off, etc. Depending on what you want to do, you should choose the appropriate partition/submission queue.

The sbatch script

To execute your Python code on the cluster, you need to write a bash script, say <my_script.sh>, specifying the information needed to run it (you may want to use nano, vim or emacs as an editor on the cluster). Here is an example:

#!/bin/bash -l

#SBATCH --account project_id 
#SBATCH --mail-type ALL 
#SBATCH --mail-user firstname.surname@unil.ch

#SBATCH --chdir /scratch/<your_username>/
#SBATCH --job-name my_code 
#SBATCH --output my_code.out

#SBATCH --partition cpu

#SBATCH --cpus-per-task 8 
#SBATCH --mem 10G 
#SBATCH --time 00:30:00 
#SBATCH --export NONE

module load python

python3 /PATH_TO_YOUR_CODE/my_code.py

Here we have used the command "module load python" before "python3 /PATH_TO_YOUR_CODE/my_code.py" to load some libraries and to make several programs available.

To display the list of available modules or to search for a package:

module avail
module spider package_name

For example, to load bowtie2:

module load bowtie2/2.4.2

To display information about the sbatch command, including the SLURM options:

sbatch --help
sbatch --usage

Finally, you submit the bash script as follows:

sbatch my_script.sh

Important: We recommend storing the above bash script and your Python code in your home folder, and storing your main input data in your work space. Your Python code can then read the data from there. Finally, you must write your results to your scratch space.

To show the state (R=running or PD=pending) of your jobs, type:

Squeue

If you realize that you made a mistake in your code or in the SLURM options, you may cancel the job:

scancel JOBID

An interactive session

Often it is convenient to work interactively on the cluster before submitting a job. Remember that when you connect to the cluster you are actually on the front-end (login) machine and you must NOT run any code there. Instead, connect to a compute node by using the Sinteractive command as shown below.


[ulambda@login ~]$ Sinteractive -c 1 -m 8G -t 01:00:00
 
interactive is running with the following options:

-c 1 --mem 8G -J interactive -p interactive -t 01:00:00 --x11

salloc: Granted job allocation 172565
salloc: Waiting for resource configuration
salloc: Nodes dna020 are ready for job
[ulambda@dna020 ~]$  hostname
dna020.curnagl

You can then run your code.

Hint: If you are having problems with a job script then copy and paste the lines one at a time from the script into an interactive session - errors are much more obvious this way.

You can see the available options by passing the -h option.

[ulambda@login1 ~]$ Sinteractive -h
Usage: Sinteractive [-t] [-m] [-A] [-c] [-J]

Optional arguments:
    -t: time required in hours:minutes:seconds (default: 1:00:00)
    -m: amount of memory required (default: 8G)
    -A: Account under which this job should be run
    -R: Reservation to be used
    -c: number of CPU cores to request (default: 1)
    -J: job name (default: interactive)
    -G: Number of GPUs (default: 0)

To log out from the node, simply type:

exit

Embarrassingly parallel jobs

Suppose you have 14 configuration files in <path_to_configurations> and you want to process them in parallel using your Python code <my_code.py>. This is an example of embarrassingly parallel programming where you run 14 independent jobs in parallel, each with a different set of parameters specified in your configuration files. One way to do it is to use a job array:

#!/bin/bash -l

#SBATCH --account project_id 
#SBATCH --mail-type ALL 
#SBATCH --mail-user firstname.surname@unil.ch 

#SBATCH --chdir /scratch/<your_username>/
#SBATCH --job-name my_code 
#SBATCH --output=my_code_%A_%a.out

#SBATCH --partition cpu
#SBATCH --ntasks 1

#SBATCH --cpus-per-task 8 
#SBATCH --mem 10G 
#SBATCH --time 00:30:00 
#SBATCH --export NONE

#SBATCH --array=0-13

module load python/3.9.13

FILES=(/path_to_configurations/*)

python /PATH_TO_YOUR_CODE/my_code.py ${FILES[$SLURM_ARRAY_TASK_ID]}

The above allocations (for example, time=30 minutes) apply to each individual task in your array.

Similarly, if the configuration parameters are simple numbers:

#!/bin/bash -l

#SBATCH --account project_id 
#SBATCH --mail-type ALL 
#SBATCH --mail-user firstname.surname@unil.ch 

#SBATCH --chdir /scratch/<your_username>/
#SBATCH --job-name my_code 
#SBATCH --output=my_code_%A_%a.out

#SBATCH --partition cpu 
#SBATCH --ntasks 1

#SBATCH --cpus-per-task 8 
#SBATCH --mem 10G 
#SBATCH --time 00:30:00 
#SBATCH --export NONE

#SBATCH --array=0-13

module load python/3.9.13

ARGS=(0.1 2.2 3.5 14 51 64 79.5 80 99 104 118 125 130 100)

python /PATH_TO_YOUR_CODE/my_code.py ${ARGS[$SLURM_ARRAY_TASK_ID]}

Another way to run embarrassingly parallel jobs is by using one-line SLURM commands. For example, this may be useful if you want to run your Python code on all the files with the .bam extension in a folder:

for file in `ls *.bam`
do
  sbatch --account project_id --mail-type ALL --mail-user firstname.surname@unil.ch \
  --chdir /scratch/<your_username>/ --job-name my_code --output my_code-%j.out --partition cpu \
  --nodes 1 --ntasks 1 --cpus-per-task 8 --mem 10G --time 00:30:00 \
  --wrap "module load gcc/9.3.0 python/3.8.8; python /PATH_TO_YOUR_CODE/my_code.py $file"
done

MPI jobs

Suppose you are using MPI codes locally and you want to launch them on Curnagl. 

The example below is a SLURM script running an MPI code, mpicode (which can be written in C, Python, Fortran, ...), on a single node (i.e. --nodes 1) using NTASKS cores without multi-threading (i.e. --cpus-per-task 1). In this example, the memory required is 32 GB in total. To run an MPI code, only the gcc and mvapich2 modules are needed; you must add any other modules required by your code after those two.

Instead of the mpirun command, you must use the srun command, which is the equivalent for running MPI codes on the cluster. To learn more about srun, see srun --help.

#!/bin/bash -l 

#SBATCH --account project_id  
#SBATCH --mail-type ALL  
#SBATCH --mail-user firstname.surname@unil.ch  

#SBATCH --chdir /scratch/<your_username>/ 
#SBATCH --job-name testmpi 
#SBATCH --output testmpi.out 

#SBATCH --partition cpu 
#SBATCH --nodes 1  
#SBATCH --ntasks NTASKS 
#SBATCH --cpus-per-task 1 
#SBATCH --mem 32G  
#SBATCH --time 01:00:00  

module purge
module load mvapich2/2.3.7  

srun mpicode 

For a complete MPI overview on Curnagl, please refer to the compiling and running MPI codes wiki page.

Good practice



Using the Clusters

What projects am I part of and what is my default account?

In order to find out which projects you are part of on the clusters, you can use the Sproject tool:

$ Sproject 

The user ulambda ( Ursula Lambda ) is in the following project accounts
  
   ulambda_default
   ulambda_etivaz
   ulambda_gruyere
 
Their default account is: ulambda_default

If Sproject is called without any arguments then it tells you what projects/accounts you are in. 

To find out which projects other users are in, you can call Sproject with the -u option:

$ Sproject -u nosuchuser

The user nosuchuser ( I really do not exist ) is in the following project accounts
..
..

 

Using the Clusters

Providing access to external collaborators

In order to allow non-UNIL collaborators to use the HPC clusters there are a few steps, which are detailed below.

Please note that the DCSR does not accredit external collaborators as this is a centralised process.

The procedures for different user groups are explained at https://www.unil.ch/ci/ui

  1. The external collaborator must first obtain an EduID via www.eduid.ch
  2. The external collaborator must ask for a UNIL account using this form. The external collaborator must give the name of the PI in the form (The PI is "sponsoring" the account)
  3. The PI sponsoring the external collaborator must use this application to add the collaborator to the appropriate project. Log into the application if necessary (top right), and click on the "Manage members list / Gérer la liste de membres" icon for your project. Usernames always have 8 characters (e.g. Greta Thunberg's username would be gthunber)
  4. The external collaborator needs to use the UNIL VPN:

    https://www.unil.ch/ci/fr/home/menuinst/catalogue-de-services/reseau-et-telephonie/acces-hors-campus-vpn/documentation.html

Once on the VPN, the external collaborator can then log in to the HPC clusters as if they were inside the UNIL network.

Using the Clusters

Requesting and using GPUs

GPU Nodes

Both Curnagl and Urblauna have nodes with GPUs - on Curnagl these are in a separate partition.

Curnagl

Currently there are 7 nodes, each with 2 NVIDIA A100 GPUs. One additional node is in the interactive partition.

Urblauna

Currently there are 2 nodes, each with 2 NVIDIA A100 GPUs. The GPUs are partitioned into 2 instances with 20 GB of memory each, so each node appears to have 4 distinct GPUs. These GPUs are also available interactively.

Requesting GPUs

In order to access the GPUs they need to be requested via SLURM as one does for other resources such as CPUs and memory. 

The flag required is --gres=gpu:1 for 1 GPU per node and --gres=gpu:2 for 2 GPUs per node. 

 An example job script is as follows:

#!/bin/bash -l

#SBATCH --cpus-per-task 12
#SBATCH --mem 64G
#SBATCH --time 12:00:00

# GPU partition request only for Curnagl 
#SBATCH --partition gpu

#SBATCH --gres gpu:1
#SBATCH --gres-flags enforce-binding

# Set up my modules

module purge
module load my list of modules
module load cuda

# Check that the GPU is visible

nvidia-smi

# Run my GPU-enabled Python code

python mygpucode.py 

If the #SBATCH --gres gpu:1 is omitted then no GPUs will be visible even if they are present on the compute node. 

If you request one GPU it will always be seen as device 0.

The #SBATCH --gres-flags enforce-binding option ensures that the CPUs allocated will be on the same PCI bus as the GPU(s) which greatly improves the memory bandwidth. This may mean that you have to wait longer for resources to be allocated but it is strongly recommended.

Partitions

The #SBATCH --partition can take different options depending on whether you are on Curnagl or on Urblauna.

Curnagl: use the gpu partition for batch jobs (or the interactive partition for short interactive tests).

Urblauna: there is no separate GPU partition, so use the default urblauna partition (or interactive).

Using CUDA

In order to use the CUDA toolkit there is a module available

module load cuda

This loads the nvcc compiler and CUDA libraries. There is also a cudnn module for the DNN tools/libraries 
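
For example, to compile a CUDA source file (the file name is illustrative):

module load cuda
nvcc -O2 -o my_kernel my_kernel.cu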


Containers and GPUs

Singularity containers can make use of GPUs but, in order to make them visible to the container environment, an extra flag "--nv" must be passed to Singularity:

module load singularity

singularity run --nv mycontainer.sif

The full documentation is at https://sylabs.io/guides/3.5/user-guide/gpu.html


Using the Clusters

How do I run a job for more than 3 days?

The simple answer is that you can't without special authorisation. Please do not submit such jobs and then ask for a time extension!

If you think that you need to run for longer than 3 days then please do the following:

Contact us via helpdesk@unil.ch and explain what the problem is.

We will then get in touch with you to analyse your code and suggest performance or workflow improvements to either allow it to complete within the required time or to allow it to be run in steps using checkpoint/restart techniques.

Recent cases involve codes that were predicted to take months to run now finishing in a few days after a bit of optimisation.

If the software cannot be optimised, there is the possibility of using a checkpoint mechanism. More information is available on the checkpoint page

Using the Clusters

Access NAS DCSR from the cluster

The NAS is available from the login node only under /nas. The folder hierarchy is:

/nas/FAC/<your_faculty>/<your_department>/<your_PI>/<your_project>

Cluster -> NAS

To copy a file to the new NAS:

cp /path/to/file /nas/FAC/<your_faculty>/<your_department>/<your_PI>/<your_project>

To copy a folder to the new NAS:

cp -r /path/to/folder /nas/FAC/<your_faculty>/<your_department>/<your_PI>/<your_project>

For more complex operations, consider using rsync. For the documentation see the man page:

man rsync

or check out this link.
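
For example, to synchronise a results folder from your scratch space to the NAS (the paths are illustrative):

rsync -av /scratch/<username>/results/ /nas/FAC/<your_faculty>/<your_department>/<your_PI>/<your_project>/results/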

NAS -> cluster

As above, just swapping the source and destination:

cp /nas/FAC/<your_faculty>/<your_department>/<your_PI>/<your_project>/file /path/to/dest
cp -r /nas/FAC/<your_faculty>/<your_department>/<your_PI>/<your_project>/folder /path/to/dest

Using the Clusters

SSH connection to DCSR cluster

This page presents how to connect to the DCSR clusters depending on your operating system.

Linux

An SSH client is installed by default on most common Linux distributions, so no extra package needs to be installed.

Connection with a password

To connect using a password, just run the following command:

ssh username@curnagl.dcsr.unil.ch

Of course, replace username in the command line with your UNIL login, and use your UNIL password.

Connection with a key

To connect with a key, you first have to generate the key on your laptop. This can be done as follows:

ssh-keygen -t ed25519
Generating public/private ed25519 key pair.
Enter file in which to save the key (/home/ejeanvoi/.ssh/id_ed25519): /home/ejeanvoi/.ssh/id_dcsr_cluster
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ejeanvoi/.ssh/id_dcsr_cluster
Your public key has been saved in /home/ejeanvoi/.ssh/id_dcsr_cluster.pub
The key fingerprint is:
SHA256:8349RPk/2AuwzazGul4ki8xQbwjGj+d7AiU3O7JY064 ejeanvoi@archvm
The key's randomart image is:
+--[ED25519 256]--+
|                 |
|    .            |
|     + .       . |
|    ..=+o     o  |
|     o=+S+ o . . |
|     =*+oo+ * . .|
|    o *=..oo Bo .|
|   . . o.o.oo.+o.|
|     E..++=o   oo|
+----[SHA256]-----+

By default, it suggests saving the private key to ~/.ssh/id_ed25519 and the public key to ~/.ssh/id_ed25519.pub. You can hit "Enter" when the question is asked if you don't use any other key. Otherwise, you can choose another path, for instance ~/.ssh/id_dcsr_cluster as in the example above.

Then, you have to enter a passphrase (twice). This is optional but you are strongly encouraged to choose a strong passphrase.

Once the key is created, you have to copy the public key to the cluster. This can be done as follows:

[ejeanvoi@archvm ~]$ ssh-copy-id -i /home/ejeanvoi/.ssh/id_dcsr_cluster ejeanvoi@curnagl.dcsr.unil.ch
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/ejeanvoi/.ssh/id_dcsr_cluster.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
ejeanvoi@curnagl.dcsr.unil.ch's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'ejeanvoi@curnagl.dcsr.unil.ch'"
and check to make sure that only the key(s) you wanted were added.

Thanks to the -i option, you can specify the path to the private key; here we use /home/ejeanvoi/.ssh/id_dcsr_cluster to match the beginning of the example. You are asked to enter your UNIL password to access the cluster and, behind the scenes, the public key is automatically copied to the cluster.

Finally, you can connect to the cluster using your key, and this time you will be asked to enter the passphrase of the key (and not the UNIL password):

[ejeanvoi@archvm ~]$ ssh -i /home/ejeanvoi/.ssh/id_dcsr_cluster ejeanvoi@curnagl.dcsr.unil.ch
Enter passphrase for key '.ssh/id_dcsr_cluster':
Last login: Fri Nov 26 10:25:05 2021 from 130.223.6.87
[ejeanvoi@login ~]$
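
Optionally, you can add an entry to your ~/.ssh/config file so that the key and username are used automatically (a minimal sketch; adapt the user and key path to your own setup):

Host curnagl
    HostName curnagl.dcsr.unil.ch
    User ejeanvoi
    IdentityFile ~/.ssh/id_dcsr_cluster

With this in place, ssh curnagl is enough to connect.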

Remote graphical interface

To visualize a graphical application running on the cluster, you have to connect using the -X option:

ssh -X username@curnagl.dcsr.unil.ch

macOS

As on Linux, SSH is natively supported on macOS, so nothing special has to be installed, except for the graphical part.

Connection with a password

This is similar to the Linux version described above.

Connection with a key

This is similar to the Linux version described above.

Remote graphical interface

To enable graphical visualization over SSH, you have to install an X server. The most common one is XQuartz, which can be installed like any other .dmg application.

Then, you have to add the following line at the beginning of the ~/.ssh/config file (if the file doesn't exist, you can create it):

XAuthLocation /opt/X11/bin/xauth

Finally, just add the -X flag to the ssh command and run your graphical applications, as on Linux.


Windows

To access the DCSR clusters from a Windows host, you have to use an SSH client.

Several options are available, such as PowerShell (on Windows 10 and later), PuTTY, or MobaXterm.

We present here only the MobaXterm (since it's a great tool that also allows you to transfer files with a GUI) and PowerShell options. For both options, we'll see how to connect through SSH with a password and with a key.

MobaXterm

Connection with a password

After opening MobaXterm, you have to create a new session: