3 October 2019
WSL Birmensdorf
Europe/Zurich timezone
hpc-ch forum

Session

Logs and Metrics Collection and Visualization at CSCS

3 Oct 2019, 10:35
Room Englersaal (WSL Birmensdorf)

Zürcherstrasse 111 8903 Birmensdorf

Conveners

  • Dino Conciatore (CSCS)

Description

As systems grow in complexity and scale, the amount of system-level data they record grows as well.

Managing the vast amounts of log and metrics data is a challenge that CSCS solved by introducing a centralized log and metrics infrastructure based on Elasticsearch, Graylog, Kibana, and Grafana. This is a fundamental service at CSCS: it makes events easy to correlate, bridging the gap between the computational workload and the nodes, which in turn enables failure diagnosis.

Currently, the Elasticsearch cluster at CSCS handles more than 30'000'000'000 online documents (one year) and another 40'000'000'000 archived documents (two years). The integrated environment, from logging and metrics to graphical representation, enables powerful dashboards and monitoring displays.
