R on the clusters (New)

R is provided via the DCSR software stack.

Interactive mode

To load R:

$> module load r-light
$> R
# Then you can use R interactively
> ...

By default, you get the latest version available (4.4.1 at the time of writing). If you need an older version, you can list the available versions as follows:

$> module spider r-light
----------------------------------------------------------------------------
  r-light:
----------------------------------------------------------------------------
     Versions:
        r-light/3.6.3
        r-light/4.0.5
        r-light/4.1.3
        r-light/4.2.3
        r-light/4.3.3
        r-light/4.4.1

Then you can load a specific version:

$> module load r-light/4.0.5
$> R --version
R version 4.0.5 (2021-03-31) -- "Shake and Throw"

Batch mode

To run R in batch mode, use Rscript to launch your script. Here is an example sbatch script, run_r.sh:

#!/bin/bash

#SBATCH --time 00-00:20:00
#SBATCH --cpus-per-task 1
#SBATCH --mem 4G

module load r-light

Rscript my_r_script.R

Then, just submit the job to Slurm:

sbatch run_r.sh

Package installation

A few core packages are installed centrally; you can see what is available with the library() function. Given the number of packages and the multiple versions available, other packages should be installed by the user.

Library relocation

By default, when you install R packages, R tries to install them in the central installation. Since this central installation is shared among all users of the cluster, you cannot install your own packages there. The location is not writable, so you will get this kind of message:

$> R
> install.packages("ggplot2")
Warning in install.packages("ggplot2") :
  'lib = "/opt/R-4.4.1/lib/R/library"' is not writable
Would you like to use a personal library instead? (yes/No/cancel)

Answer yes to the "Would you like to use a personal library instead?" question so that packages are installed in a personal library.

By default, this personal library is located in your home directory. On DCSR clusters, the home directory is limited in both the amount of data (50 GB at most) and the number of files (200,000 files at most) you can store. Installing R packages in your home directory could quickly fill all the available space. This is why your personal library should be relocated.
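
To see how close a personal library is to these limits, a small check like the one below can help. It is only a sketch: dir_footprint is a hypothetical helper name, and the default path $HOME/R assumes that R created the personal library there.

```shell
# Hypothetical helper: print the size and the number of files of a directory
# (defaults to $HOME/R, assuming that is where R placed the personal library).
dir_footprint() {
  dir="${1:-$HOME/R}"
  if [ ! -d "$dir" ]; then
    echo "no such directory: $dir" >&2
    return 1
  fi
  # The trailing slash makes du and find follow $dir if it is a symbolic link.
  size=$(du -sh "$dir/" | cut -f1)
  files=$(find "$dir/" -type f | wc -l | tr -d ' ')
  echo "$dir: $size, $files files"
}

# Example call:
#   dir_footprint "$HOME/R"
```

The two numbers can then be compared with the 50 GB and 200,000 file limits mentioned above.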

A good practice is to relocate your R library to a location in one of your work projects. Suppose your work project is located in /work/FAC/Lettres/GREAT/ulambda/default: create a sub-directory inside it, for instance /work/FAC/Lettres/GREAT/ulambda/default/RLIB_for_ursula. There are several ways to tell R to use this new personal library, but the easiest is to define the R_LIBS_USER variable.

You can either add the following line to each of your Slurm scripts (before R is invoked):

export R_LIBS_USER=/work/FAC/Lettres/GREAT/ulambda/default/RLIB_for_ursula
Rscript …

Or you can define it in ~/.Renviron by adding the following line to the file:

R_LIBS_USER=/work/FAC/Lettres/GREAT/ulambda/default/RLIB_for_ursula

The second option, using ~/.Renviron, is probably cleaner, but the first option is more versatile, especially if you want to use several personal libraries for different projects and requirements.
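
For the first option, the directory has to exist before R starts, because R only adds library directories that already exist to its search path. The sketch below wraps both steps in one shell function. The name use_r_library is hypothetical and not part of the DCSR software stack.

```shell
# Hypothetical helper (not part of the DCSR stack): create a library
# directory if needed and point R_LIBS_USER at it for this shell or job.
use_r_library() {
  lib_dir="$1"
  if [ -z "$lib_dir" ]; then
    echo "usage: use_r_library <directory>" >&2
    return 1
  fi
  # Create the directory first: R skips library paths that do not exist.
  mkdir -p "$lib_dir" || return 1
  export R_LIBS_USER="$lib_dir"
}

# Example call in a Slurm script, before Rscript is invoked:
#   use_r_library /work/FAC/Lettres/GREAT/ulambda/default/RLIB_for_ursula
#   Rscript my_r_script.R
```

Inside R, calling .libPaths() lists the library directories in use, which is a quick way to confirm that the new location was picked up.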

CRAN packages

Installing R packages from CRAN is straightforward thanks to the install.packages() function. However, be careful, since it might fill your home directory very quickly. For big packages with a large number of dependencies, such as adegenet, you will probably reach the quota before the end of the installation. Here is a solution to mitigate that problem:

  • Remove your current R library (this deletes any packages already installed there):
rm -rf $HOME/R
  • Create a new library in your work directory (modify the path according to your situation) and link it from your home directory:
mkdir -p /work/FAC/FBM/DEE/my_py/default/jdoe/R
cd $HOME
ln -s /work/FAC/FBM/DEE/my_py/default/jdoe/R
  • Install your R packages
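
If you would rather keep the packages already installed in $HOME/R, the same result can be reached by moving the directory instead of deleting it. The sketch below is a variant of the steps above, not the documented procedure. The name relocate_with_symlink is hypothetical, and the destination path is whatever you choose inside your work project.

```shell
# Hypothetical helper: move a directory to a new location and leave a
# symbolic link behind, so existing paths keep working.
relocate_with_symlink() {
  src="$1"    # e.g. $HOME/R
  dest="$2"   # e.g. a directory inside your /work project
  if [ -L "$src" ]; then
    echo "$src is already a symbolic link, nothing to do" >&2
    return 0
  fi
  if [ -e "$dest" ]; then
    echo "$dest already exists, refusing to overwrite it" >&2
    return 1
  fi
  mkdir -p "$(dirname "$dest")" || return 1
  if [ -d "$src" ]; then
    mv "$src" "$dest" || return 1    # keep packages already installed
  else
    mkdir -p "$dest" || return 1     # nothing to move: start empty
  fi
  ln -s "$dest" "$src"
}

# Example call (adapt the destination to your own project):
#   relocate_with_symlink "$HOME/R" /work/FAC/FBM/DEE/my_py/default/jdoe/R
```

The function refuses to overwrite an existing destination and does nothing when the source is already a symbolic link, so running it twice is harmless.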

BioConductor packages

The first step is to install the BioConductor package manager, and then to install packages with BiocManager::install(). For instance:

$> module load r-light
$> R
> install.packages("BiocManager")
> BiocManager::install("biomaRt")

Handling dependencies

Sometimes R packages depend on external libraries. In most cases the library is already installed on the cluster, so you just need to load the corresponding module before installing the package from your R session.

If the package installation still fails, you need to define the following variables. For example, if the package depends on the gsl and mpfr libraries, do the following:

module load gsl mpfr
export CPATH=$GSL_ROOT/include:$MPFR_ROOT/include
export LIBRARY_PATH=$GSL_ROOT/lib:$MPFR_ROOT/lib
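
When several libraries are involved, the two export lines can be generated from a list of installation roots. The sketch below assumes, as in the example above, that each module defines a <NAME>_ROOT variable with include and lib sub-directories. The function name add_build_paths is hypothetical. Unlike the plain export lines, it appends to any existing CPATH and LIBRARY_PATH instead of overwriting them.

```shell
# Hypothetical helper: append <root>/include to CPATH and <root>/lib to
# LIBRARY_PATH for every installation root given as an argument.
add_build_paths() {
  for root in "$@"; do
    # Skip empty arguments, e.g. when a module did not define its *_ROOT.
    [ -n "$root" ] || continue
    CPATH="${CPATH:+$CPATH:}$root/include"
    LIBRARY_PATH="${LIBRARY_PATH:+$LIBRARY_PATH:}$root/lib"
  done
  export CPATH LIBRARY_PATH
}

# Example call, after `module load gsl mpfr`:
#   add_build_paths "$GSL_ROOT" "$MPFR_ROOT"
```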

GitHub/development packages

To install packages from GitHub/GitLab or other websites, you can use the devtools library as follows:

$> module load r-light
$> R
> library(devtools)
> install_github("N-SDM/covsel")
> install_url("https://cran.r-project.org/src/contrib/Archive/rgdal/rgdal_1.6-7.tar.gz")

Missing dependencies

In some cases, package installation fails because of missing dependencies. In that case, please send us an email at helpdesk@unil.ch with a subject starting with "DCSR R package installation", and provide the name of the package that you cannot install.

Setting up an alternate personal library

If you want to set up an alternate location in which to install R packages, you can proceed as follows:

mkdir -p ~/R/my_personal_lib2
# If you already have a ~/.Renviron file, make a backup
cp -iv ~/.Renviron ~/.Renviron_backup
echo 'R_LIBS_USER=~/R/my_personal_lib2' > ~/.Renviron

Then relaunch R. Packages will then be installed under ~/R/my_personal_lib2.