Course software for decision trees / random forests

In the practicals, we will use only a small dataset, so little computing power and memory are needed. You can therefore do the practicals on various computing platforms. However, since participants may use many different types of computers and software, we recommend using the UNIL JupyterLab to do the practicals.

If you choose to work on the UNIL JupyterLab, you do not need to prepare anything, since all the necessary libraries are already installed there. In any case, you will receive a guest username during the course, so you will be able to work on the UNIL JupyterLab.

Otherwise, if you prefer to work on your laptop or on Curnagl, please make sure you have a working installation before the day of the course, because we will be unable to provide any assistance with this on the day. If you have difficulties with the installation on Curnagl, we can help you: please contact us before the course at helpdesk@unil.ch with the subject: DCSR ML course.

On the other hand, if you are unable to install the libraries on your laptop, we will unfortunately not be able to help you (there are too many particular cases), so you will need to use the UNIL JupyterLab during the course.

JupyterLab

Here are some instructions for using the UNIL JupyterLab to do the practicals.

You need to be connected either to the eduroam wifi with your UNIL account or to the UNIL VPN.

Go to the webpage:  https://jupyter.dcsr.unil.ch/jupyter

Enter the login and password that you received during the course. Due to a technical issue, you may see the warning message "Your connection is not private". This is expected: click on the "Advanced" button and then on "Proceed to dcsrs-jupyter.ad.unil.ch (unsafe)".

Python

Click on the "ML" square button in the Notebook panel.

Copy / paste the commands from the HTML practical file to the Jupyter Notebook.

To execute a command, click on "Run the selected cells and advance" (the right arrow), or SHIFT + RETURN.

When you have finished the practicals, select File / Log out.

R

Click on the "ML R" square button in the Notebook panel.

Copy / paste the commands from the HTML practical file to the Jupyter Notebook.

To execute a command, click on "Run the selected cells and advance" (the right arrow), or SHIFT + RETURN.

When you have finished the practicals, select File / Log out.

Laptop

You may need to install development tools, including C and Fortran compilers (e.g. Xcode on Mac, gcc and gfortran on Linux, Visual Studio on Windows).

Python installation

Here are some instructions for installing decision tree and random forest libraries on your laptop. You need Python >= 3.7.
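
To see which version your Python command actually launches, a short script along the following lines can help. It is only a sketch: the names MINIMUM and meets_minimum are chosen here for illustration and are not part of any library. Save it under any file name and run it with python3 (python on Windows).

```python
import sys

# Minimum version stated in the instructions above.
MINIMUM = (3, 7)


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


print("Running Python", sys.version.split()[0])
print("Recent enough for the course:", meets_minimum(sys.version_info))
```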

For Mac and Linux

We will use a terminal to install the libraries.

Let us create a virtual environment. Open your terminal and type:

python3 -m venv mlcourse

source mlcourse/bin/activate

pip3 install scikit-learn pandas matplotlib graphviz seaborn

You can terminate the current session:

deactivate

exit

TO DO THE PRACTICALS (today or another day):

You can use any Python IDE (e.g. Jupyter Notebook or PyCharm), but you need to launch it after activating the virtual environment. For example, for Jupyter Notebook:

source mlcourse/bin/activate

pip3 install notebook

jupyter notebook

For Windows

If you do not have Python installed, you can use either Conda: https://docs.conda.io/en/latest/miniconda.html or the official Python installer: https://www.python.org/downloads/windows/

Let us create a virtual environment. Open your terminal and type:

C:\Users\user>python -m venv mlcourse

C:\Users\user>mlcourse\Scripts\activate.bat

(mlcourse) C:\Users\user>

(mlcourse) C:\Users\user>pip3 install scikit-learn pandas matplotlib graphviz seaborn

You can terminate the current session:

(mlcourse) C:\Users\user>deactivate

C:\Users\user>

TO DO THE PRACTICALS (today or another day):

You can use any Python IDE (e.g. Jupyter Notebook or PyCharm), but you need to launch it after activating the virtual environment. For example, for Jupyter Notebook:

C:\Users\user>mlcourse\Scripts\activate.bat

(mlcourse) C:\Users\user>pip3 install notebook

(mlcourse) C:\Users\user>jupyter notebook

Information: Use Control-C to stop this server.
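
On either platform, once the virtual environment is activated, you may want to confirm that the libraries can be imported before the day of the course. The following is a minimal sketch, not an official course script: check_modules and COURSE_MODULES are names introduced here for illustration. It tries to import each library and prints its version. The last line rests on an assumption: if the practicals render trees with the graphviz Python package, the separate Graphviz "dot" program must also be installed, since pip does not provide it.

```python
import importlib
import shutil

# Import names of the packages installed with pip above.
# Note that the pip package "scikit-learn" is imported as "sklearn".
COURSE_MODULES = ["sklearn", "pandas", "matplotlib", "graphviz", "seaborn"]


def check_modules(names):
    """Try to import each module; map its name to a version string, or None on failure."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # a missing or a broken installation both count as failures
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


for name, version in check_modules(COURSE_MODULES).items():
    print(f"{name:12s}", version if version is not None else "NOT AVAILABLE")

# Assumption: rendering trees with the graphviz Python package also needs the
# Graphviz "dot" program, which pip does not install.
print("dot program:", shutil.which("dot") or "not found on PATH")
```

You can paste it into a notebook cell or run it as a script. Any line showing NOT AVAILABLE means the corresponding pip install step should be repeated inside the activated environment.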

R installation

Here are some instructions for installing decision tree and random forest libraries on your laptop.

You need R >= 4.0. Run R in your terminal or launch RStudio.

For Windows users, you can download R here: https://cran.r-project.org/bin/windows/base/

REMARK: The R libraries will be installed in your home directory. To allow it, you must answer yes to the questions:

Would you like to use a personal library instead? (yes/No/cancel) yes

Would you like to create a personal library to install packages into? (yes/No/cancel) yes

And select Switzerland for the CRAN mirror.

install.packages("rpart")

install.packages("rpart.plot")

install.packages("randomForest")

install.packages("tidyverse")

The installation of "tidyverse" may report some conflicts, but do not worry: you should still be able to do the practicals.

You can terminate the current R session:

q()

Save workspace image? [y/n/c]: n

TO DO THE PRACTICALS (today or another day):

Simply run R in your terminal or launch RStudio.

Curnagl

For the practicals, it will be convenient to copy/paste text from a web page to the terminal on Curnagl, so please make sure you can do this before the course. You also need to make sure that your terminal has an X server.

For Mac users, download and install XQuartz (X server): https://www.xquartz.org/

For Windows users, download and install the MobaXterm terminal (which includes an X server). Click on the "Installer edition" button on the following webpage: https://mobaxterm.mobatek.net/download-home-edition.html

For Linux users, you do not need to install anything.
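
Once you are connected to Curnagl with ssh -Y (see the installation sections that follow), one quick sign that X forwarding is active is the DISPLAY environment variable, which ssh sets on the remote side when forwarding is established. The sketch below checks it from Python. The function name has_display is chosen here for illustration, and a set DISPLAY is a necessary sign rather than a guarantee that windows will open.

```python
import os


def has_display(environ=os.environ):
    """Return True if DISPLAY is set to a non-empty value."""
    return bool(environ.get("DISPLAY"))


if has_display():
    print("X forwarding looks active, DISPLAY =", os.environ["DISPLAY"])
else:
    print("DISPLAY is not set: check the X server and the -Y option of ssh")
```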

Python installation

Here are some instructions for installing decision tree and random forest libraries on the UNIL cluster called Curnagl. Open a terminal on your laptop and type (if you are located outside the UNIL you will need to activate the UNIL VPN):

ssh -Y < my unil username >@curnagl.dcsr.unil.ch

Here and in what follows we added the brackets < > to emphasize the username, but you should not write them in the command. Enter your UNIL password.

For Windows users with the MobaXterm terminal: launch MobaXterm, click on "Start local terminal" and type the command ssh -Y < my unil username >@curnagl.dcsr.unil.ch. Enter your UNIL password. You should then be on Curnagl. Alternatively, launch MobaXterm, click on the session icon and then on the SSH icon. Fill in: remote host = curnagl.dcsr.unil.ch, specify username = < my unil username >. Finally, click OK and enter your password. If you are asked "do you want to save password?", answer No if you are not sure. You should then be on Curnagl.

See also the documentation: https://wiki.unil.ch/ci/books/high-performance-computing-hpc/page/ssh-connection-to-dcsr-cluster

cd /work/TRAINING/UNIL/CTR/rfabbret/cours_hpc/

mkdir < my unil username >

cd < my unil username >

For convenience, you will install the libraries from the front-end (login) node to do the practicals. Note however that it is normally recommended to install libraries from the interactive partition, by first running the command Sinteractive -m 4G -c 1.

module load gcc python/3.9.13

python -m venv mlcourse

source mlcourse/bin/activate

pip install scikit-learn pandas matplotlib graphviz seaborn

You can terminate the current session:

deactivate

exit

TO DO THE PRACTICALS (today or another day):

ssh -Y < my unil username >@curnagl.dcsr.unil.ch

cd /work/TRAINING/UNIL/CTR/rfabbret/cours_hpc/< my unil username >

For convenience, you will work directly on the front-end (login) node to do the practicals. Note however that working directly on the front-end node is normally not allowed, and that you should instead start an interactive session with the command Sinteractive -m 4G -c 1.

module load gcc python/3.9.13

source mlcourse/bin/activate

python
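
After python has started, you may wish to confirm that the interpreter really comes from the mlcourse environment rather than from the module alone. The lines below are a minimal sketch that can be pasted at the Python prompt. The helper name in_virtualenv is ours, and the check relies only on the standard sys module.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter was started from a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtualenv())
```

If the second line prints False, the source mlcourse/bin/activate step was probably skipped in the current session.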

R installation

Here are some instructions for installing decision tree and random forest libraries on the UNIL cluster called Curnagl. Open a terminal on your laptop and type (if you are located outside the UNIL you will need to activate the UNIL VPN):

ssh -Y < my unil username >@curnagl.dcsr.unil.ch

Here and in what follows we added the brackets < > to emphasize the username, but you should not write them in the command. Enter your UNIL password.

For Windows users with the MobaXterm terminal: launch MobaXterm, click on "Start local terminal" and type the command ssh -Y < my unil username >@curnagl.dcsr.unil.ch. Enter your UNIL password. You should then be on Curnagl. Alternatively, launch MobaXterm, click on the session icon and then on the SSH icon. Fill in: remote host = curnagl.dcsr.unil.ch, specify username = < my unil username >. Finally, click OK and enter your password. If you are asked "do you want to save password?", answer No if you are not sure. You should then be on Curnagl.

See also the documentation: https://wiki.unil.ch/ci/books/high-performance-computing-hpc/page/ssh-connection-to-dcsr-cluster

cd /work/TRAINING/UNIL/CTR/rfabbret/cours_hpc/

mkdir < my unil username >

cd < my unil username >

For convenience, you will install the libraries from the front-end (login) node to do the practicals. Note however that it is normally recommended to install libraries from the interactive partition, by first running the command Sinteractive -m 4G -c 1.

module load gcc python/3.9.13 r/4.2.1

R

REMARK: The R libraries will be installed in your home directory. To allow it, you must answer yes to the questions:

Would you like to use a personal library instead? (yes/No/cancel) yes

Would you like to create a personal library to install packages into? (yes/No/cancel) yes

And select Switzerland for the CRAN mirror.

install.packages("rpart")

install.packages("rpart.plot")

install.packages("randomForest")

install.packages("tidyverse")

The installation of "tidyverse" may report some conflicts, but do not worry: you should still be able to do the practicals.

You can terminate the current R session:

q()

Save workspace image? [y/n/c]: n

TO DO THE PRACTICALS (today or another day):

ssh -Y < my unil username >@curnagl.dcsr.unil.ch

cd /work/TRAINING/UNIL/CTR/rfabbret/cours_hpc/< my unil username >

For convenience, you will work directly on the front-end (login) node to do the practicals. Note however that working directly on the front-end node is normally not allowed, and that you should instead start an interactive session with the command Sinteractive -m 4G -c 1.

module load gcc python/3.9.13 r/4.2.1

R

Revision #55
Created 12 September 2022 08:44:41 by Ewan Roche
Updated 20 November 2023 08:37:42 by Philippe Jacquet