hpc-ch forum on HPC and Data as a Service

Europe/Zurich
Y35-F-51 Teleteaching (UZH)

Universität Zürich Winterthurerstrasse 190 8057 Zürich
Bastian Bukatz (UZH), Michele De Lorenzi (CSCS)
Description

Scientific progress depends largely on exchange among researchers worldwide. It is therefore essential to make access to HPC systems and the underlying data as simple as possible. As-a-service offerings provide endpoints for users to interface with; these are usually API-driven but can commonly also be controlled via a web console in the user's browser.

Internally complex HPC systems generally need a high degree of internal automation. This automation provides varying levels of fault tolerance and resiliency, as well as the ability to scale up and down to meet the capacity and performance requirements of the workloads that users submit to the service. Such systems are usually intended to run their day-to-day functions without human intervention.

In this forum, we want to explore specific challenges for the HPC community and potential solutions in an as-a-service scenario.

Key Questions

  • What service do our users expect?
  • What specific challenges do HPC and the underlying data pose when offered as a service?
  • How do we find the balance between security and openness?
  • What automation mechanisms help in an as-a-service scenario?
     
Participants
71
    • 09:30 10:00
      Welcome Coffee & Registration 30m
    • 10:00 10:15
      Welcome and Introduction 15m
      Speakers: Bastian Bukatz (UZH), Michele De Lorenzi
    • 10:15 10:45
      Aligning Vision and Language in Embodied AI Tasks 30m

      A human moving around a busy urban environment is surrounded by signs, driving traffic, and other people. An artificial agent navigating a virtual environment of busy city streets using language instructions is also challenged with determining relevant information.

      In this presentation, we introduce the research area of embodied AI, focusing on transformer-based agents in navigation tasks.

      Speaker: Jason Armitage (University of Zurich)
    • 10:45 11:15
      CSCS Service Catalog on "Alps": Compute & Data 30m
      Speaker: Maria Grazia Giuffreda (CSCS)
    • 11:15 11:45
      Jupyterhub as a Playground for Everyone 30m

      JupyterHub is a web service that lets developers (scientists) use their browsers to develop on a remote machine without deep IT knowledge. On Euler, we provide this service on Kubernetes (k8s) and create the users' sessions with Slurm. The key to the success of our implementation is the wide range of customization available to both the HPC team and the users, which lets us provide highly tailored Jupyter sessions and other software through a single frontend. By reusing the same simple tricks over and over, we have been able to provide RServer, Tensorboard and CodeServer. In this presentation, I will show how JupyterHub works with our Slurm API and explain the different tricks that allow users to customize their instances and run any other type of service.

      Speaker: Loic Hausammann (ETH Zurich)
    • 11:45 12:15
      The First Tenant-Managed "Alps" vCluster - a presentation by CSCS and PSI 30m
      Speakers: Derek Feichtinger (PSI), Hussein Harake (CSCS)
    • 12:15 13:15
      Lunch and Networking 1h
    • 13:15 13:45
      Unlocking the Potential of HPC Systems Use for Optimizing Job Queues, Minimizing Wait Times, and Boosting Efficiency 30m

      In this talk, we delve into using High-Performance Computing (HPC) systems to tackle a critical challenge: the unnecessary waiting that hampers productivity due to resource request inaccuracies and job inefficiency.

      Our exploration centres on:
      1. Devising appropriate job queue configurations.
      2. Uncovering the relation between resource requests and wait times.
      3. Improving the way jobs utilize their allocated resources.
      4. Deploying novel methods for automatic detection and reporting of inefficiencies.

      This work is a data-driven journey through the analysis of 55 million jobs submitted by 500 users over eight months on a university cluster, with the aim of redefining HPC system efficiency.

      Speaker: Thomas Jakobsche (University of Basel)
    • 13:45 14:15
      How to Find the Balance between Security and Openness in the Context of Scientific Research Data 30m

      The balance between security and openness is an ongoing and evolving challenge. Security encompasses a broad spectrum, including national interests, cybersecurity, personal safety, intellectual property protection, and research data. Openness, on the other hand, embodies transparency, collaboration, and the sharing of information and ideas. Striking the right balance between these principles is essential for fostering innovation and ensuring the resilience of systems in the face of evolving threats. This presentation will scratch the surface of the topic and discuss challenges one can face when implementing the FAIR principles and data management plans for research projects.

      Speaker: Victor Holanda Rusu (CSCS)
    • 14:15 14:45
      Community Development 30m

      The session will be dedicated to:

      1. Members Update: Latest Developments in our Organization.
        Members are kindly invited to give a short update on their activities.
      2. Preparations for the next hpc-ch forum on Business Continuity, Crisis Management, and Energy Savings & Efficiency, May 23, 2024 (SWITCH)
    • 14:45 15:45
      Meerkat Guided Tour 1h

      The Meerkat project uses an online platform on ScienceCloud that lets animal behaviour scientists access the project's data and perform basic analyses on it. The scientific project stretches back over two decades and concentrates on capturing audio samples of meerkat communication to better understand their behaviour.

    • 15:45 16:00
      Farewell and End of the Meeting 15m