hpc-ch forum on HPC Workflows and Batch Systems

Europe/Zurich
HG D 16.2 (ETH Zurich)

Rämistrasse 101, 8006 Zürich
Christian Bolliger (ETH Zurich), Maria Grazia Giuffreda (ETH Zurich / CSCS)
Description

Overview:
The HPC-CH Forum is an annual gathering dedicated to exploring the latest advancements in High-Performance Computing (HPC) workflows and batch systems. This year, the forum will bring together experts from academia, industry, and research institutions to discuss cutting-edge technologies and share insights into the future of HPC workflows, batch systems and scheduling in general. Attendees will have the opportunity to engage with thought leaders and participate in interactive sessions designed to foster collaboration and innovation in the HPC community.

Key Topics:

  • Optimization of HPC workflows for enhanced performance
  • Integration of emerging technologies in batch systems
  • Integration of interactive user interfaces in batch systems
  • Case studies on successful HPC workflow implementations
  • Challenges and solutions in managing HPC resources
  • Future trends in HPC workflows and their impact on research and industry


Who Should Attend?
This forum is tailored for professionals involved in HPC, including researchers, system administrators, software developers, and industry experts. It's an ideal platform for those who are interested in enhancing their knowledge of HPC workflows and batch systems, and for those looking to build a network with leading experts in the field.

    • 09:30 10:00
      Welcome Coffee & Registration
    • 10:00 10:15
      Welcome and Introduction 15m
      Speakers: Christian Bolliger (ETH Zurich), Maria Grazia Giuffreda (ETH Zurich / CSCS)
    • 10:15 11:00
      A Short History of Batch Systems 45m

      Batch systems date back to the era of punch cards. Most computer users no longer encounter them, because the processing of tasks is hidden from them. In the world of high-performance computing, however, they still play a crucial role.
      The history of batch systems is in fact also a history of supercomputing.

      Speaker: Christian Bolliger (ETH Zurich)
    • 11:00 11:30
      Hyper-Converged Cloud Infrastructure at CSCS 30m

      This presentation covers the hyper-converged cloud infrastructure at the Swiss National Supercomputing Centre (CSCS), focusing on the integration of Kubernetes (RKE2), ArgoCD, and Rancher for managing and deploying clusters.

      Rancher handles deployment on bare-metal and HPC nodes, while Harvester manages virtual clusters. ArgoCD automates deployments and ensures consistency, enabling continuous delivery. The combination of Kubernetes, ArgoCD, Rancher, Harvester, and Terraform creates a scalable, adaptable cloud infrastructure.

      This case study outlines the architecture, workflows, and benefits of this approach.

      Speaker: Dino Conciatore (CSCS)
    • 11:30 12:00
      Leveraging High-Performance Computing for Physics-Based Ground Motion Modelling: Applications in Seismic Hazard 30m

      This talk presents a framework for ground motion modelling based on physics-based simulations of seismic wave propagation using GPU-based numerical methods. I will highlight efforts to incorporate complex site properties, such as topography and near-surface geotechnical layers, into advanced earthquake simulations. These simulations enable the development of digital twins for Swiss sites to better inform both countrywide and local hazard estimates and further contribute to probabilistic seismic hazard assessment (PSHA).

      Speaker: Maria Koroni (ETH Zurich)
    • 12:00 12:30
      CryoSPARC Containerization and Integration with SLURM Scheduler: Technical Talk

      This presentation describes the installation and deployment of CryoSPARC, using docker-compose and Portainer to simplify service management. The deployment is tailored for integration with a SLURM cluster. A containerized setup was chosen to simplify management and deployment, and to enable clean separation of data and project files between the different user groups that use CryoSPARC.

      Convener: Nicolas Richart (EPFL)
    • 12:30 13:30
      Lunch and Networking 1h
    • 13:30 14:00
      Managing Slurm on Clusters with Emerging Technologies 30m

      User expectations of clusters are changing; meanwhile, modern HPC clusters contain resources that are not fully captured by the batch system's cluster model. We present the challenges and developments of bridging the batch system and the cluster to improve the user experience.

      Speaker: Urban Borštnik (ETH Zurich)
    • 14:00 14:30
      Interfacing AiiDA to FirecREST: A RESTful Approach to HPC Workflows 30m

      In today’s supercomputing landscape, securing both computational resources and results has become increasingly important. Users typically access HPC systems via SSH. However, due to security vulnerabilities and the high cost of login servers, many supercomputing centers are now exploring modern alternatives.

      One promising alternative is to adopt REST API–based solutions to manage HPC resources. While such APIs provide enhanced security and lower operational costs, they also introduce challenges because REST has higher latency than SSH. This performance limitation challenges developers of large-scale, high-performance workflow managers, such as AiiDA, to redesign their systems so that they handle server communication delays efficiently.

      This presentation will explore how AiiDA, as a high-performance workflow manager, has been adapted to interface with FirecREST—a REST API–based interface for managing HPC resources. I will discuss some of the technical challenges we faced and how we addressed them to maintain performance.

      Speaker: Ali Khosravi (PSI)
    • 14:30 15:00
      Community Development 30m
    • 15:00 15:30
      Transfer to Guided Tour location
    • 15:30 16:30
      Guided Tours 1h

      We offer two different guided tours:

      1. focusTerra - Earthquake Simulator

      In the permanent exhibition of focusTerra, we explore the causes of earthquakes, watch tectonic plates move throughout our planet’s history and talk about the scientific value of quakes. In the simulator you will experience earthquakes in a safe environment and learn what to do in case of an emergency.

      2. Datacentre at ETH Zurich
    • 16:30 16:45
      Farewell and End of the Meeting 15m