Data management is essential for making research data easily accessible and usable. Its key ingredients include data policies and data workflows.
The data workflows are based on the policies, which are implemented as a set of parameters stored in the metadata catalogue. This contribution presents the role of the metadata catalogue in relation to the data management services and the hardware solutions underlying the data storage systems. The architecture of the storage system consists of four layers, each addressing a different set of challenges. The first layer (online) is designed as a fast cache for data generated directly at the scientific instruments during experiments. The second layer (offline) provides the performance needed for data processing during and after beamtimes. The third layer (dCache disk pool) delivers the capacity for long-term storage, and the last layer (tape archive) provides data safety and long-term archiving. The storage system is able to accept 2 PB/day of raw data, which demonstrates its real capabilities with all sub-services involved in the process. The storage system is connected to the high-performance computing cluster, which supports remote data analysis; alternatively, external users can export data outside of the European XFEL facility.
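To make the four-layer architecture and the quoted ingest rate more concrete, the following is a minimal illustrative sketch, not the facility's actual software. The names `StorageLayer`, `LAYERS`, and `sustained_rate_gb_per_s` are hypothetical, and the conversion assumes decimal units (1 PB = 10^6 GB) and a rate averaged evenly over 24 hours, neither of which is stated in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLayer:
    """One tier of the storage system (hypothetical model for illustration)."""
    name: str
    role: str


# The four layers in the order data passes through them, as described above.
LAYERS = (
    StorageLayer("online", "fast cache for data generated at the instruments"),
    StorageLayer("offline", "performance for processing during and after beamtimes"),
    StorageLayer("dCache disk pool", "capacity for long-term storage"),
    StorageLayer("tape archive", "data safety and long-term archive"),
)


def sustained_rate_gb_per_s(petabytes_per_day: float) -> float:
    """Convert PB/day to an average GB/s, assuming decimal units (1 PB = 10**6 GB)."""
    seconds_per_day = 24 * 60 * 60
    return petabytes_per_day * 1_000_000 / seconds_per_day


if __name__ == "__main__":
    for depth, layer in enumerate(LAYERS, start=1):
        print(f"{depth}. {layer.name}: {layer.role}")
    # Average rate implied by the 2 PB/day figure under the stated assumptions.
    print(f"2 PB/day is about {sustained_rate_gb_per_s(2):.1f} GB/s on average")
```

Under these assumptions, 2 PB/day corresponds to an average of roughly 23 GB/s; actual peak rates during data taking may differ, since the abstract gives only the daily total.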