Management and Security of Sensitive HPC Data

Europe/Zurich
Auditorium A


Vital-IT Group SIB Swiss Institute of Bioinformatics Quartier Sorge, Genopode building CH-1015 Lausanne Switzerland
Ioannis Xenarios (Vital-IT), Michele De Lorenzi (CSCS)
Description
Introduction

Over the last 10 years, the emergence and steady improvement of high-throughput technologies in the life sciences and biomedical fields have generated massive amounts of data, providing an unprecedented data-based description of the human individual. This data explosion is expected to continue, with an estimated 2 to 40 exabytes of genetics and genomics data to be produced in the next 10 years. Managing the life cycle of such rich, personalised data sets, which comprises data handling, processing and preservation, remains challenging.

In addition, fostering interdisciplinary collaborations among mutually "untrusted" parties (e.g., HPC centers, hospitals and industry) while preserving individual data privacy is a source of friction and a subject of intense (computational) development.

In this forum, we would like to discuss the current state of the art in managing biomedical and other sensitive data, the inherent costs of compliance and standardization, and how to preserve data privacy when expert, intensive computational analyses are required within an HPC environment.

Key questions
  • How can public leaks of sensitive data be avoided, protecting at the same time your reputation and that of your institution?
  • What are the best practices in sensitive data management?
  • What are the best practices in preserving patient data?
  • Which HPC tools and technologies foster multi-disciplinary collaborations across several partners while preserving patient data security?
  • How can we efficiently manage the cost of compliance and standardization?
  • How can we make sense of very large biomedical data for Big Data analytics by harmonizing disparate data sets from heterogeneous sources while respecting international and institutional regulations on patient data security?
Participants
  • Alex Upton
  • Alexander Kashev
  • Christian Bolliger
  • Christian Iseli
  • Christian Mouchet
  • Claudio Alberti
  • Colin McMurtrie
  • Daniele Passerone
  • Dario Petrusic
  • David Froelicher
  • Derek Heinrich Feichtinger
  • Diana Coman Schmid
  • Eur Ing Fotis Georgatos
  • Gianfranco Sciacca
  • Guy Mael Horclois
  • Hamid Hussain-Khan
  • Heiko Rölke
  • Heinz Stockinger
  • Hon Wai Wan
  • Ioannis Xenarios
  • Iulian Dragan
  • Jani Heikkinen
  • Jean Louis Raisaro
  • Jorge Molina
  • Marc Filliettaz
  • Marc Robinson-Rechavi
  • Martial Sankar
  • Martin Jacquot
  • Mattia Belluco
  • Michele De Lorenzi
  • Nick Holway
  • Nico Färber
  • Olivier Byrde
  • Patrick Zosso
  • Pierre Berthier
  • Pietro Benedusi
  • Raluca Hodoroaba
  • Ricardo Silva
  • Roberto Fabbretti
  • Robin Engler
  • Sadaf Alam
  • Sebastien Moretti
  • Sigve Haug
  • Sofiane Sarni
  • Szymon Gadomski
  • Teo Brasacchio
  • Thomas Wuest
  • Vittoria Rezzonico
  • Volker Flegel
  • Warren Paulus
  • Wolfgang Zipfel
  • Yann Sagon
    • 09:30 10:00
      Coffee and registration 30m
    • 10:00 10:15
      Welcome and introduction
      Conveners: Ioannis Xenarios (Vital-IT), Michele De Lorenzi (CSCS)
    • 10:15 11:00
      Keynote Presentation. Data Protection for Personalized Health

      In this talk, we will describe the challenges of data protection in personalized health and discuss possible solutions. We will also explain how this issue will be addressed in the framework of the Swiss Personalized Health Network.

      Convener: Jean-Pierre Hubaux (EPFL)
    • 11:00 11:45
      Keynote Presentation. Data&Computing Services for Personalized Health: a Paradigm Shift

      The talk will introduce the Swiss Personalized Health Ecosystem (SPHN/PHRT projects and beyond) and give a user and usage perspective on handling sensitive biomedical data in secure computing environments. The special context is set by patient data, which come with strict legal and ethical requirements as well as high and challenging computing demands.

      To offer top-class services for Personalized Health Research, secure and powerful IT infrastructures for data storage, computing and sharing are a must. They are necessary but not sufficient. What this diverse and dynamic ecosystem also needs are innovative teams with hybrid expertise (for example, medicine, bioinformatics, IT). With the user experience as a central focus, the challenge lies in finding the right balance between security and usability.

      Christian Bolliger will expand on how Leonhard Med, a secure computing and data platform for Personalized Health, addresses the requirements and challenges of Personalized Health data handling.

      Convener: Diana Coman Schmid (ETH Zurich)
    • 11:45 12:15
      Building a Secure HPC System - Technical and Regulatory Aspects

      Leonhard Med is an HPC platform designed for the needs of medical research, more specifically for data analytics and AI in medical research. This brings quite a few challenges and constraints.

      The system must fulfil the legal and ethical requirements and give researchers the tools they need and are accustomed to. It must be possible to exchange data with similar sites and hospitals while preserving the privacy of patient data.

      Last but not least, the platform must be adaptable to future requirements, while system administration must remain efficient.

      Convener: Christian Bolliger (ETH Zurich)
    • 12:15 13:15
      Lunch and networking 1h
    • 13:15 13:45
      Security Matters

      This presentation will focus on security awareness in the management of sensitive data in HPC environments.

      Convener: Guy-Maël Horclois (ETH Zurich / CSCS)
    • 13:45 14:30
      Keynote Presentation. A Confederation Approach to HPC in Switzerland: Example of Vital-IT Over the Last 15 Years Dedicated to Life Science

      This presentation will highlight the challenges of maintaining a competence centre aligned with the needs of a very diverse and evolving environment.
      Examples ranging from biomedical to basic science will serve as a common thread through the presentation, shedding light on the need to coordinate and to have means of exchange among competence centres in Switzerland and beyond.

      Convener: Ioannis Xenarios (Vital-IT)
    • 14:30 15:00
      Federating Clinical Data for Biomedical Research

      The protection of clinical data is of paramount importance, but the resulting ethical and legal constraints can hinder its use in research.
      This presentation will cover the construction of a federated database containing detailed clinical information that is amenable to remote analysis through a central server. The distinctive feature of this system is that it allows complex analyses to be performed without any individual-level data being moved or copied from their original location.
      The technical challenges, lessons learned, limitations and opportunities gained by such an approach will be discussed.

      Convener: Iulian Dragan (Vital-IT)
    • 15:00 15:15
      Coffee Break 15m
    • 15:15 15:45
      Streaming Genomes with MPEG-G, the New ISO Standard for Genomics Data?

      This presentation introduces the essential features of MPEG-G, the emerging ISO standard for genome sequencing data compression, storage and transport. MPEG-G is an ISO standardization initiative addressing the problems and limitations currently faced by applications relying on commonly used legacy formats.
      The standard is currently in its final development stage and is planned for publication in early 2019. The objective is to provide compression and transport technologies yielding efficient, versatile and economical handling of sequencing data and associated information. Besides efficient compression and native transport capabilities, notable features include data streaming, selective access to compressed data, data aggregation, file annotation, conversion to legacy formats, enforcement of privacy rules, and selective encryption of sequencing data and metadata. These features make MPEG-G the technology base for supporting the interoperability of many complex use cases and associated implementations.

      Convener: Claudio Alberti (GenomSys)
    • 15:45 16:15
      Panel Discussion: Security Considerations in Cloud and Using Bare Metal HPC Technologies

      Cloud and bare metal HPC technologies for computing, storage and networking are rapidly evolving and converging.

      In this panel, a group of experts and technical leaders will discuss key challenges and opportunities for the community of HPC service providers, i.e., hpc-ch. In particular, the discussion will focus on the diverse nature of data sources and their management and control within a data centre environment.

      Panelists:
      - Christian Bolliger (ETH Zurich)
      - Guy-Maël Horclois (CSCS)
      - Daniele Passerone (Empa)
      - Heinz Stockinger (SIB Swiss Institute of Bioinformatics)

    • 16:15 16:20
      Farewell and end of the meeting