ODIX

A large part of the knowledge available at DLR exists in the form of technical documents. These range from protocols of (large-scale) facilities to data sheets and other texts. However, these documents are written primarily for human readers and are therefore often difficult to process automatically. Given the growing quantity and increasing heterogeneity of data sets, this prevents knowledge that has already been generated from being used effectively and efficiently for current and future research topics.
The aim of ODIX is to develop methods that process the knowledge contained in documents and other sources (e.g. measurement series) so that it can be used directly in AI applications and comprehensively explored and analysed by humans. To this end, factual information is first extracted from the documents and annotated with semantic concepts; this step also includes creating suitable automated interfaces for the extraction. The resulting knowledge graph is stored together with other collected data in the data management system shepard and linked to that data. On this basis, interfaces for both human and automated use of the now structured knowledge will then be developed. The project will conclude with a demonstration of the developed prototype, for which the domain-specific institutes will provide examples and the performance of the software will be evaluated. Further document types, data types and, in particular, further exploration options may be addressed in future projects.
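The extraction and annotation step described above can be illustrated with a minimal sketch. It is not the ODIX implementation: the concept names (ex:MaximumPressure, ex:OperatingTemperature), the document identifier, the sample data sheet and the simple "label: value unit" line format are all assumptions made for illustration, and storing the result in shepard is deliberately left out.

```python
import re
from dataclasses import dataclass

# Hypothetical vocabulary: maps a label found in a document to a semantic concept.
CONCEPTS = {
    "max pressure": "ex:MaximumPressure",
    "operating temperature": "ex:OperatingTemperature",
}


@dataclass(frozen=True)
class Triple:
    """One edge of a knowledge graph: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Assumed line format, e.g. "Max pressure: 12.5 bar".
FACT_PATTERN = re.compile(
    r"^\s*(?P<label>[A-Za-z ]+?)\s*:\s*(?P<value>-?\d+(?:\.\d+)?)\s*(?P<unit>\S+)\s*$"
)


def extract_triples(doc_id: str, text: str) -> list[Triple]:
    """Extract numeric facts from text and annotate them with known concepts."""
    triples = []
    for line in text.splitlines():
        match = FACT_PATTERN.match(line)
        if match is None:
            continue  # line does not state a numeric fact
        concept = CONCEPTS.get(match["label"].lower())
        if concept is None:
            continue  # label has no semantic concept in the vocabulary
        triples.append(Triple(doc_id, concept, f'{match["value"]} {match["unit"]}'))
    return triples


# Invented sample document, for illustration only.
SAMPLE = """Data sheet: test chamber
Max pressure: 12.5 bar
Operating temperature: 350 K
Remarks: none
"""

if __name__ == "__main__":
    for triple in extract_triples("ex:doc-001", SAMPLE):
        print(triple)
```

Each resulting triple links a document to a semantic concept and a value, which is the kind of structured statement a knowledge graph is built from. A real pipeline would replace the regular expression with more robust extraction methods and the dictionary with a curated ontology.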
The methods and tools developed in ODIX will make a decisive contribution to tapping and utilising the enormous treasure trove of data at DLR. The semantic annotation of the existing databases not only makes them directly accessible to the partners involved, but also opens them up for applications in other areas in accordance with the FAIR principles. In this way, ODIX also positions itself as an AI enabler in the context of the ongoing digitisation of DLR.