Filetransfer from the cluster
filetransfer.dcsr.unil.ch
https://filetransfer.dcsr.unil.ch is a service provided by the DCSR to allow you to transfer files to and from external collaborators.
This is an alternative to SWITCHFileSender and the space available is 6TB with a maximum per user limit of 4TB - this space is shared between all users so it is unlikely that you will be able to transfer 4TB of data at once.
The filetransfer service is based on LiquidFiles and the user guide is available at https://man.liquidfiles.com/userguide.html
In order to transfer files to and from the DCSR clusters without using the web browser it is also possible to use the CLI tools as explained below
Configuring the service
First you need to connect to the web interface at https://filetransfer.dcsr.unil.ch and connect using your UNIL username (e.g. ulambda for Ursula Lambda) and password. This is not your EduID password but rather the one you use to connect to the clusters.
Once connected go to settings (the cog symbol in the top right corner) then the API tab
The API key is how you authenticate from the clusters and this secret should never be shared. It can be reset via the yellow button.
Transferring files from the cluster
Connect to the login node and load the liquidfiles module
[ulambda@login ~]$ module load liquidfiles
[ulambda@login ~]$ liquidfiles
Usage:
liquidfiles <command> <command_args>
Valid commands are:
attach Uploads given files to server.
attach_chunk Uploads given chunk of file to server.
delete_attachments Deletes the given attachments.
delete_filelink Deletes the given filelink.
download Download given files.
file_request Sends the file request to specified user.
filedrop Sends the file(s) by filedrop.
filelink Uploads given file and creates filelink on it.
filelinks Lists the available filelinks.
get_api_key Retrieves api key for the specified user.
messages Lists the available messages.
send Sends the file(s) to specified user.
Type 'liquidfiles help <command_name>' to see command specific options and usage.
Abnormal exit codes:
1 Command line arguments are invalid - Invalid command name, missing required argument, invalid value for specific argument.
2 CURL error - Can't connect to host, connection timeout, certificate check failure, etc.
3 Error during file upload - Invalid API key, Invalid filename, etc.
4 Error during file send to user.
5 Error in file system - Can't open file, etc.
For example to upload a file and create a file link
liquidfiles filelink --server=https://filetransfer.dcsr.unil.ch --api_key=9MUQeF5nG899lHdCtg myfile.dat
You can then connect to the web interface from you workstation to manage the files and send messages as required.
Transferring large files
Using the service it is possible to transfer large (100 GB to 1TB files) although this is not without problems.
As preparing and uploading files can take a while we recommend that this is performed in a tmux session which means that even if your connection to the cluster is lost the process continues and you can reconnect.
Staging the files
We recommend that you create TAR files containing the data you wish to transfer and stage this in your /scratch space. Depending on the data type it can be useful to compress it first.
$ cd /scratch/ulambda
$ mkdir myytransfer
$ cd myytransfer
$ tar -cvf mydata.tar /work/path/to/my/data
Then calculate the checksum of the file to be transfered
$ sha256sum mydata.tar
7aac249b9ec0835361f44c84921a194e587a38daecadf302e9dec44386c9fb36 mydata.tar
Split the file and transfer chunks
Whilst it might be possible to transfer huge files it isn't recommended - above ~10GB we recommend that you follow the procedure given below.
Split the file into chunks
$ split --verbose -d -a4 -b1G mydata.tar
creating file 'x0000'
creating file 'x0001'
creating file 'x0002'
creating file 'x0003'
..
..
creating file 'x0102'
In the staging directory this will create files of exactly 1GB in size- here Usrula's file is 102.5 GB so there are 103 chunks
Use a loop and the attach_chunk command
First we need to know how many files there are
$ ls x* | wc -l
103
This is because we need to tell the service how many bits the file has been split into so it knows when the upload is complete.
Now we note our API key and use the following bash loop (this can also be put in a script).
$ for a in `seq -w 0 102`; do liquidfiles attach_chunk --server=https://filetransfer.dcsr.unil.ch --api_key=9MUQeF5nG899lHdCtg --chunk=$a --chunks=103 --filename=mydata.tar x0$a; done
Uploading chunk 'x0000'.
100% [================================================================================]
Current chunk uploaded successfully.
Uploading chunk 'x0001'.
100% [================================================================================]
Current chunk uploaded successfully.
..
Uploading chunk 'x0102'.
100% [================================================================================]
All chunks of file uploaded successfully. ID: FP0LAQ9FGFAosPNioe6ZyQ
Once all the chunks are uploaded the file will be assembled/processed and after a short while it will be visible in the web interface.
Here we see a previously uploaded file of 304 GB called my file.ffdata
Cleaning up
Once the file is uploaded please don't forget to clean up the TAR file and the chunks.
$ cd /scratch/ulambda/mytransfer
$ rm *
$ cd ..
$ rmdir mytransfer