DLR (CC BY-NC-ND 3.0).
Guided by the vision of a seamlessly integrated digital chain of highly varied processes, the shepard system (storage for heterogeneous product and research data) is being developed at the Center for Lightweight Production Technology in Augsburg. One focal point is the interdisciplinary utilisation of all generated data, e.g. by AI methods for data analysis or for the contextual curation of data.
Shepard is a scalable system for the highly flexible, automated storage and linking of heterogeneous data (including measurement data, simulation results and CAD data) and related metadata (e.g. provenance information or semantic classification of the data) along widely varying real and digital process chains. It is intended to give all users a simple and sustainable way to store, retrieve, analyze and share research data, enabling comprehensive collaboration and thereby forming the basis for consistent research data management from experiment to publication. Because it has been developed and prototypically used for the structured acquisition of experiments in a wide variety of disciplines (from virtual simulation workflows through production technology to flight experiments and optical test ranges), the system already covers a large number of research domains, especially in the typical research fields of the DLR.
Simple connection options via standardised interfaces enable the automated recording of data, including annotation with metadata. These interfaces are also used for evaluation and provide the basis for connecting any AI framework. The provided web interface offers convenient access to shepard's basic functions. More complex applications can be connected via the provided REST API. The basic architecture of shepard links different existing databases, which optimizes the storage and linking of highly heterogeneous data sets. Because shepard consistently uses open source technologies, vendor lock-in is avoided and the system can be operated free of charge. However, many of the components used offer corresponding enterprise licensing models, which support long-term scalability.
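As an illustration of how an external application might record data with metadata through a REST interface, the following minimal Python sketch builds (but does not send) such a request. The base URL, the endpoint path, the payload fields and the `X-API-KEY` header name are assumptions chosen for illustration only; they are not taken from the shepard documentation, which should be consulted for the actual API.

```python
import json
import urllib.request

# Hypothetical base URL of a shepard-like backend (assumption, not a real server).
BASE_URL = "https://shepard.example.org/api"


def build_upload_request(base_url, collection_id, name, attributes, api_key):
    """Build a POST request that would record one data object with metadata.

    The endpoint layout and the header name are illustrative assumptions.
    The request is only constructed here, never sent.
    """
    # Metadata (e.g. provenance) travels with the data object as key/value pairs.
    payload = {"name": name, "attributes": attributes}
    body = json.dumps(payload).encode("utf-8")
    url = f"{base_url}/collections/{collection_id}/dataObjects"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
    )


request = build_upload_request(
    BASE_URL,
    collection_id=1,
    name="tensile-test-001",
    attributes={"operator": "lab-a", "machine": "press-3"},
    api_key="my-secret-key",
)
print(request.get_method(), request.full_url)
```

Sending the request would then be a call to `urllib.request.urlopen(request)` against a real instance. Separating construction from sending keeps the sketch runnable without network access.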
An extension of the functionality, including more complex search queries across content, visualizations and the connection of internal and external tools, is expected shortly.
Shepard has been published on GitLab (https://gitlab.com/dlr-shepard) under the Apache 2.0 license, and contributions from external participants are warmly welcome. This approach opens the project to a broad community that provides detailed feedback and continuous further development, while contributing to the digital transformation of science.