Filetransfer from the cluster
https://filetransfer.dcsr.unil.ch is a service provided by the DCSR to allow you to transfer files to and from external collaborators.
This is an alternative to SWITCHFileSender and the space available is 6TB with a maximum per user limit of 4TB - this space is shared between all users so it is unlikely that you will be able to transfer 4TB of data at once.
The filetransfer service is based on LiquidFiles and the user guide is available at https://man.liquidfiles.com/userguide.html
In order to transfer files to and from the DCSR clusters without using the web browser it is also possible to use the CLI tools as explained below
Configuring the service
First you need to connect to the web interface at https://filetransfer.dcsr.unil.ch and connect using your UNIL username (e.g. ulambda for Ursula Lambda) and password. This is not your EduID password but rather the one you use to connect to the clusters.
Once connected go to settings (the cog symbol in the top right corner) then the API tab
The API key is how you authenticate from the clusters and this secret should never be shared. It can be reset via the yellow button.
Transferring files from the cluster
Connect to the login node and load the liquidfiles module
[ulambda@login ~]$ module load liquidfiles [ulambda@login ~]$ liquidfiles Usage: liquidfiles <command> <command_args> Valid commands are: attach Uploads given files to server. attach_chunk Uploads given chunk of file to server. delete_attachments Deletes the given attachments. delete_filelink Deletes the given filelink. download Download given files. file_request Sends the file request to specified user. filedrop Sends the file(s) by filedrop. filelink Uploads given file and creates filelink on it. filelinks Lists the available filelinks. get_api_key Retrieves api key for the specified user. messages Lists the available messages. send Sends the file(s) to specified user. Type 'liquidfiles help <command_name>' to see command specific options and usage. Abnormal exit codes: 1 Command line arguments are invalid - Invalid command name, missing required argument, invalid value for specific argument. 2 CURL error - Can't connect to host, connection timeout, certificate check failure, etc. 3 Error during file upload - Invalid API key, Invalid filename, etc. 4 Error during file send to user. 5 Error in file system - Can't open file, etc.
For example to upload a file and create a file link
liquidfiles filelink --server=https://filetransfer.dcsr.unil.ch --api_key=9MUQeF5nG899lHdCtg myfile.dat
You can then connect to the web interface from you workstation to manage the files and send messages as required.
As preparing and uploading files can take a while we recommend that this is performed in a tmux session which means that even if your connection to the cluster is lost the process continues and you can reconnect.
Transferring large files
If using a single file upload doesn't work and it is not possible to split the data into multiple smaller files then the following information may be useful
Staging the files
We recommend that you create TAR files containing the data you wish to transfer and stage this in your /scratch space. Depending on the data type it can be useful to compress it first.
$ cd /scratch/ulambda $ mkdir mytransfer $ cd mytransfer $ tar -cvf mydata.tar /work/path/to/my/data
Then calculate the checksum of the file to be transfered
$ sha256sum mydata.tar 7aac249b9ec0835361f44c84921a194e587a38daecadf302e9dec44386c9fb36 mydata.tar
Split the file and transfer chunks
Whilst it might be possible to transfer huge files in one upload, it isn't recommended and above ~100GB we recommend that you follow the procedure given below.
Split the file into chunks
$ split --verbose -d -a4 -b1G mydata.tar creating file 'x0000' creating file 'x0001' creating file 'x0002' creating file 'x0003' .. .. creating file 'x0102'
In the staging directory this will create files of exactly 1GB in size- here Usrula's file is 102.5 GB so there are 103 chunks
Use a loop and the attach_chunk command
First we need to know how many files there are
$ ls x* | wc -l 103
This is because we need to tell the service how many bits the file has been split into so it knows when the upload is complete.
Now we note our API key and use the following bash loop (this can also be put in a script).
$ module load liquidfiles $ for a in `seq -w 0 102`; do liquidfiles attach_chunk --server=https://filetransfer.dcsr.unil.ch --api_key=9MUQeF5nG899lHdCtg --chunk=$a --chunks=103 --filename=mydata.tar x0$a; done Uploading chunk 'x0000'. 100% [================================================================================] Current chunk uploaded successfully. Uploading chunk 'x0001'. 100% [================================================================================] Current chunk uploaded successfully. .. Uploading chunk 'x0102'. 100% [================================================================================] All chunks of file uploaded successfully. ID: FP0LAQ9FGFAosPNioe6ZyQ
Alternatively we can also use variables which makes the loop cleaner and easier to put in a script:
module load liquidfiles SERVER=https://filetransfer.dcsr.unil.ch KEY=9MUQeF5nG899lHdCtg CHUNKS=103 MYFILE=mydata.tar NC=`expr $CHUNKS - 1` for a in `seq -w 0 $NC`; do liquidfiles attach_chunk --server=$SERVER --api_key=$KEY --chunk=$a --chunks=$CHUNKS --filename=$MYFILE x0$a; done
A shell script that does the same things is
#!/bin/bash for a in `seq -w 0 102`; do liquidfiles attach_chunk --server=https://filetransfer.dcsr.unil.ch --api_key=9MUQeF5nG899lHdCtg --chunk=$a --chunks=103 --filename=mydata.tar x0$a done
Once all the chunks are uploaded the file will be assembled/processed and after a short while it will be visible in the web interface.
Here we see a previously uploaded file of 304 GB called my file.ffdata
Once the file is uploaded please don't forget to clean up the TAR file and the chunks.
$ cd /scratch/ulambda/mytransfer $ rm * $ cd .. $ rmdir mytransfer