May 18, 2017
Europe/Zurich timezone
hpc-ch forum

Data Science and Machine Learning have become relevant in many research areas and industries. The amount of collected data repeatedly breaks previously known speed and volume barriers, which creates the need for automated data processing. Before automated data processing can take place, a machine or algorithm has to be trained for the intended task, which can be information search/retrieval, gaining insights or taking actions. The training phase can take a long time and occupy a large part of the available infrastructure, especially if it has to be repeated on new incoming data. To minimize infrastructure costs, machine learning workloads tend to be offloaded to specialized hardware.

Machines are trained to take semantic action in specific domains. To act successfully in such a context, machine learning can't rely solely on data and general-purpose algorithms; domain models play an important role in generating accurate results. Data Science - which can be seen as a combination of mathematics, heuristics and domain knowledge - helps discover patterns and regularities in data, which ideally give rise to new models that help us understand the digitized ocean.

Key questions
  • Which algorithms and computational models are the best fit for machine learning at scale?
  • How scalable are the current implementations of support vector machines and deep neural networks?
  • Which type of hardware is a good match for offloading machine learning workloads: DSPs, ASICs, GPUs?
  • Which deployment models and data flows best support machine learning applications?
  • How does machine learning impact resource usage in an HPC cluster?
  • What can Data Science adopt from HPC and vice versa?
Room: New York
T-Systems Kloten (Balsberg), Balz-Zimmermann-Strasse 7, CH-8302 Kloten