Measuring a job's CO2 footprint

There are three main ways in which the use of the HPC clusters can be more taxing for the environment than it needs to be:

  1. by using more of the cluster RAM (Random Access Memory) than needed for your calculations (i.e., the "job" you submit to the cluster),
  2. by having your submitted jobs crash,
  3. by requesting more cores (i.e., computing units) for a job than needed.

All three waste energy. To help minimize this waste, the GreenAlgorithms4HPC package was installed on the clusters. It can estimate your carbon output and energy consumption, either for a particular job run on the clusters or over a time period that you specify. It can also report how much memory your jobs use compared to how much they actually require.

Green Algorithms

The methodology is based on Green Algorithms, developed by Loïc Lannelongue. He also developed the GreenAlgorithms4HPC package, a plugin that processes the accounting information of an HPC cluster to estimate the CO2 footprint.

How to use it

You need to load the following module:

module load ga4hpc

And then you can check your CO2 footprint for a period of time:

green_hpc -S 2025-11-24 -E 2025-11-25

The following output is generated:

        #################################                                                                              
        #                               #                                                                              
        #  Carbon footprint on curnagl  #
        #       - user: cruiz1 -        #
        #   (2025-11-24 / 2025-11-25)   #  
        #################################                                                                              
                                                                                                                                                         
                                                                                                                                                         
              --------------                                                
             |   51 gCO2e   |                                                                                                                            
              --------------                                                                                                                             
                                                                                                                                                         
    ...This is equivalent to:                                               
         - 0.055 tree-months                                                
         - driving 0.29 km                                                  
         - 0.0 flights between Paris and London

There are several options to filter jobs that you can check with:

green_hpc -h 

You can also get an estimation for a particular job:

green_hpc -S 2025-11-24 -E 2025-11-25 

How precise is the estimation?

The estimation does not take into account the CO2 produced during manufacturing. It only covers the power drawn while the computing resources are in use. The power usage is based on the TDP (Thermal Design Power) provided by the manufacturer. This value is a limit on the power a CPU or GPU can draw. The computed value is:

Energy consumption = time * (resources_1 * TDP_1 + resources_2 * TDP_2 + ...)
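
As a rough illustration of the formula above, here is a minimal Python sketch. The function name, the resource counts and the TDP values are hypothetical placeholders chosen for this example; they are not the figures the package uses for the clusters.

```python
def estimate_energy_kwh(runtime_hours, resources):
    """Apply: energy = time * (count_1 * TDP_1 + count_2 * TDP_2 + ...).

    `resources` is a list of (count, tdp_watts) pairs.
    """
    total_power_watts = sum(count * tdp_watts for count, tdp_watts in resources)
    return runtime_hours * total_power_watts / 1000.0  # Wh -> kWh


# Hypothetical job: 2 hours on 48 cores at an assumed 5 W per core,
# plus 2 GPUs at an assumed 300 W each.
job_resources = [(48, 5.0), (2, 300.0)]
energy_kwh = estimate_energy_kwh(2.0, job_resources)
print(f"Estimated energy: {energy_kwh:.2f} kWh")
```

Because every resource is counted at its full TDP for the whole runtime, the result should be read as an upper estimate of what the job really drew.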

Results of some tests:

Configuration    Application            GA estimate    Measured
CPU, 48 cores    NAS CPU benchmark      0.343          0.3017
2 GPU A100       Julia heat equation    0.355          0.350
2 GPU A100       LLM inference          0.376          0.234
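
To make the gap between estimate and measurement easier to read, the following Python sketch computes the relative deviation for each test. It only uses the values listed above; the function name is introduced for this example.

```python
# (configuration, application, GA estimate, measured value) from the tests above
results = [
    ("CPU, 48 cores", "NAS CPU benchmark", 0.343, 0.3017),
    ("2 GPU A100", "Julia heat equation", 0.355, 0.350),
    ("2 GPU A100", "LLM inference", 0.376, 0.234),
]


def relative_deviation_percent(estimate, measured):
    """Return how far the estimate is above (+) or below (-) the measurement, in %."""
    return (estimate - measured) / measured * 100.0


for config, application, estimate, measured in results:
    deviation = relative_deviation_percent(estimate, measured)
    print(f"{config:15s} {application:20s} {deviation:+6.1f} %")
```

In these three tests the estimate is always above the measured value, by roughly 1% to 61%, which is what one would expect from a calculation based on a power limit.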

Assumptions and limitations

The package works using information stored by the clusters’ workload manager.

  • The workload manager doesn't always log the exact CPU usage time, and when this information is missing, all cores are assumed to be used at 100%. This may lead to slightly overestimated carbon footprints, although the order of magnitude is probably correct.
  • Conversely, the wasted energy due to memory overallocation may be largely underestimated, as the information needed is not always logged.
  • Only the carbon footprint of cluster use is measured, not the impact of cooling the computers down, or of building the facilities.
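
The effect of the first assumption can be illustrated with a minimal Python sketch. The function, the usage factor and the numbers are hypothetical and are not taken from the package; they only show why a missing CPU usage value leads to a worst-case estimate.

```python
def core_energy_wh(runtime_hours, n_cores, tdp_per_core_watts, cpu_usage=None):
    """Energy attributed to the cores, scaled by the logged CPU usage (0.0 to 1.0).

    When the usage was not logged (None), fall back to 100%, which is the
    worst case and therefore an overestimate for jobs that partly idle.
    """
    usage = 1.0 if cpu_usage is None else cpu_usage
    return runtime_hours * n_cores * tdp_per_core_watts * usage


# Hypothetical job: 10 hours on 16 cores, assumed 5 W per core.
logged = core_energy_wh(10, 16, 5.0, cpu_usage=0.6)  # usage was logged
fallback = core_energy_wh(10, 16, 5.0)               # usage missing
print(f"with logged usage: {logged:.0f} Wh, with fallback: {fallback:.0f} Wh")
```

The fallback never gives a lower value than the logged case, so a missing usage value can only push the estimate up.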

Note that the package cannot access the accounting information on the same day a job was run: to measure the output of jobs you ran on day 1, you need to wait until day 2. In addition: