Speaker
Thomas Jakobsche
(University of Basel)
Description
In this talk, we address a critical challenge on High-Performance Computing (HPC) systems: the unnecessary waiting, caused by inaccurate resource requests and inefficient jobs, that hampers productivity.
Our exploration centres on:
1. Devising appropriate job queue configurations.
2. Uncovering the relation between resource requests and wait times.
3. Improving the way jobs utilize their allocated resources.
4. Deploying novel methods for automatic detection and reporting of inefficiencies.
This work is data-driven: we analyse 55 million jobs submitted by 500 users over eight months on a university cluster, with the aim of improving HPC system efficiency.