HPDA system Kratos



The Kratos High-Performance Data Analysis (HPDA) cluster is the primary working tool of the Institute of Data Science of the German Aerospace Center (Deutsches Zentrum für Luft- und Raumfahrt; DLR) in Jena. With a computing power of 1.62 Petaflops and a storage capacity of 1.15 Petabytes, it supports interdisciplinary research in the fields of data acquisition, data management, and data analysis and intelligence, strengthening DLR’s data science expertise beyond the institute.
As an HPDA cluster, Kratos specializes in processing large datasets, particularly with Deep Learning algorithms. The system was inaugurated in 2020 and expanded in 2021 with six additional GPU nodes featuring 48 NVIDIA A100 GPUs. Megware carried out both the initial installation and the expansion and provides ongoing support.
The system comprises:
- 116 CPU nodes with Intel® Xeon® Platinum processors (6,144 physical cores in total),
- 2 Fat nodes with 1.5 Terabytes of RAM each,
- 4 GPU nodes with 32 NVIDIA Tesla V100 GPUs,
- 6 GPU nodes with 48 NVIDIA A100 GPUs,
- a parallel file system with 1.15 Petabytes of usable storage and a high-speed network connection (Intel Omni-Path, 100 GBit/s).
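The node and GPU counts listed above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch only: the class name `GpuPartition`, the dictionary keys, and the assumption that GPUs are spread evenly across the nodes of each partition are illustrative and not taken from the system's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuPartition:
    """One group of GPU nodes, as given in the hardware list (hypothetical model)."""
    model: str
    nodes: int
    gpus: int

    @property
    def gpus_per_node(self) -> int:
        # Assumption: GPUs are distributed evenly over the nodes of a partition.
        if self.gpus % self.nodes != 0:
            raise ValueError("GPUs do not divide evenly across nodes")
        return self.gpus // self.nodes


# Figures copied from the hardware list; the keys are illustrative labels.
NODE_COUNTS = {"cpu": 116, "fat": 2, "gpu_v100": 4, "gpu_a100": 6}
GPU_PARTITIONS = [
    GpuPartition(model="V100", nodes=4, gpus=32),
    GpuPartition(model="A100", nodes=6, gpus=48),
]
FAT_NODE_RAM_TB = 1.5

total_nodes = sum(NODE_COUNTS.values())
total_gpus = sum(p.gpus for p in GPU_PARTITIONS)
total_fat_ram_tb = NODE_COUNTS["fat"] * FAT_NODE_RAM_TB

for p in GPU_PARTITIONS:
    print(f"{p.model}: {p.gpus_per_node} GPUs per node")
print(f"Nodes: {total_nodes}, GPUs: {total_gpus}, fat-node RAM: {total_fat_ram_tb} TB")
```

Under the even-distribution assumption, both GPU partitions work out to eight GPUs per node, for 80 GPUs across 128 nodes in total.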
The cluster is hosted at the University Computing Center (URZ) of Friedrich Schiller University Jena at Johannisfriedhof and is connected to the German Research Network (DFN) with a 1 GBit/s link. This enables access not only for DLR but also for cooperation partners such as other research institutions and SMEs.
At the DLR Institute of Data Science, Kratos supports research in the following areas:
- Data Acquisition and Mobilization (DMO): Ensuring data quality, availability, and access through user-centric methods, secure software development, and multimodal information acquisition.
- Data Management and Enrichment (DMA): Development of methods for information extraction, interoperability of heterogeneous data, semantic models, and efficient data management.
- Data Analysis and Intelligence (DAI): Application of Machine Learning, process-knowledge-integrated data exploration, and Causal Inference to identify patterns and dependencies in large datasets.
Kratos is also used in the DLR Quantum Computing Initiative (QCI), for example in the projects QCoKaIn (quantum-based Causal Inference for anomaly detection), Quant²AI (evaluation of quantum AI methods), and QCI-Connect (Quantum-as-a-Service platform).
Additional Information: Press release for the launch