Passer au contenu principal

Filetransfer from the cluster

filetransfer.dcsr.unil.ch

https://filetransfer.dcsr.unil.ch is a service provided by the DCSR to allow you to transfer files to and from external collaborators.

This is an alternative to SWITCHFileSender and the space available is 6TB with a maximum per user limit of 4TB - this space is shared between all users so it is unlikely that you will be able to transfer 4TB of data at once.

The filetransfer service is based on LiquidFiles and the user guide  is available at https://man.liquidfiles.com/userguide.html

In order to transfer files to and from the DCSR clusters without using the web browser it is also possible to use the CLI tools as explained below

 

Configuring the service

 

First you need to connect to the web interface at https://filetransfer.dcsr.unil.ch and connect using your UNIL username (e.g. ulambda for Ursula Lambda) and password. This is not your EduID password but rather the one you use to connect to the clusters.

Once connected go to settings (the cog symbol in the top right corner) then the API tab

 

Screenshot 2022-01-25 at 10.11.35.png

The API key is how you authenticate from the clusters and this secret should never be shared. It can be reset via the yellow button.

 

Transferring files from the cluster

 

Connect to the login node and load the liquidfiles module

[ulambda@login ~]$ module load liquidfiles

[ulambda@login ~]$ liquidfiles 
Usage:
	liquidfiles <command> <command_args>

Valid commands are:
	attach                 Uploads given files to server.
	attach_chunk           Uploads given chunk of file to server.
	delete_attachments     Deletes the given attachments.
	delete_filelink        Deletes the given filelink.
	download               Download given files.
	file_request           Sends the file request to specified user.
	filedrop               Sends the file(s) by filedrop.
	filelink               Uploads given file and creates filelink on it.
	filelinks              Lists the available filelinks.
	get_api_key            Retrieves api key for the specified user.
	messages               Lists the available messages.
	send                   Sends the file(s) to specified user.

Type 'liquidfiles help <command_name>' to see command specific options and usage.

Abnormal exit codes:
	1     Command line arguments are invalid - Invalid command name, missing required argument, invalid value for specific argument.
	2     CURL error - Can't connect to host, connection timeout, certificate check failure, etc.
	3     Error during file upload - Invalid API key, Invalid filename, etc.
	4     Error during file send to user.
	5     Error in file system - Can't open file, etc.

For example to upload a file and create a file link

You can then connect to the web interface from you workstation to manage the files and send messages as required.

 

Transferring large files

Using the service it is possible to transfer large (100 GB to 1TB files) although this is not without problems.

As preparing and uploading files can take a while we recommend that this is performed in a tmux session which means that even if your connection to the cluster is lost the process continues and you can reconnect. 

Staging the files

We recommend that you create TAR files containing the data you wish to transfer and stage this in your /scratch space. Depending on the data type it can be useful to compress it first.

$ cd /scratch/ulambda
$ mkdir myytransfer
$ cd myytransfer
$ tar -cvf mydata.tar /work/path/to/my/data

Then calculate the checksum of the file to be transfered

$ sha256sum mydata.tar
7aac249b9ec0835361f44c84921a194e587a38daecadf302e9dec44386c9fb36  mydata.tar
 
Split the file and transfer chunks

Whilst it might be possible to transfer huge files it isn't recommended - above ~10GB we recommend that you follow the procedure given below.

Split the file into chunks

$ split --verbose -d -a4 -b1G mydata.tar
creating file 'x0000'
creating file 'x0001'
creating file 'x0002'
creating file 'x0003'
..
..
creating file 'x0102'

In the staging directory this will create files of exactly 1GB in size- here Usrula's file is 102.5 GB so there are 103 chunks

Use a loop and the attach_chunk command

First we need to know how many files there are

$ ls x* | wc -l
103

This is because we need to tell the service how many bits the file has been split into so it knows when the upload is complete.  

Now we note our API key and use the following bash loop (this can also be put in a script). 

$ for a in `seq -w 0 102`; do liquidfiles attach_chunk --server=https://filetransfer.dcsr.unil.ch --api_key=9MUQeF5nG899lHdCtg --chunk=$a --chunks=103 --filename=mydata.tar x0$a; done

Uploading chunk 'x0000'.
100% [================================================================================]
Current chunk uploaded successfully.
Uploading chunk 'x0001'.
100% [================================================================================]
Current chunk uploaded successfully.
..
Uploading chunk 'x0102'.
100% [================================================================================]

All chunks of file uploaded successfully. ID: FP0LAQ9FGFAosPNioe6ZyQ

Once all the chunks are uploaded the file will be assembled/processed and after a short while it will be visible in the web interface.

Here we see a previously uploaded file of 304 GB called my file.ffdata

Screenshot 2022-02-11 at 20.19.32.png