Performance of LLM backends and models on Curnagl
TODO
- Introduction (Cristian)
- Backends and models tested (Margot)
- Hardware description (Margot)
- Inference latency results (Margot and Cristian) -> create one table per model, replace node names with GPU card names, and improve the column titles.
Introduction
Backends and models tested
Hardware description
Inference latency results