Performance of LLM backends and models in Curnagl

TODO

Introduction (Cristian)
Backends and models tested (Margot)
Hardware description (Margot)
Inference latency results (Margot and Cristian) -> create one table per model, replace node names with GPU card names, and improve the column titles.

Introduction

Backends and models tested

Hardware description

Inference latency results
