Collaborative Machine Learning for Data Value Creation (CoLDa)
The 'Collaborative Machine Learning for Data Value Creation' (CoLDa) project is a three-year collaboration between the DLR Institute for AI Safety and Security and the Carl von Ossietzky University of Oldenburg. Together with the university, our institute is researching the practical development of federated machine learning. The aim is to enable the use of sensitive company data for AI applications while ensuring data protection and safeguarding business secrets.

Conventional machine learning (ML) approaches require the centralisation of training data, which can be problematic in terms of data protection regulations or internal company security requirements. Federated learning offers a solution to this problem: rather than centralising the data, local models are trained directly at their respective locations, and only the model parameters are aggregated into a global model. Sensitive information always remains protected at its original location.
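The train-locally-then-aggregate cycle described above can be illustrated with a minimal sketch of federated averaging (FedAvg-style parameter aggregation). This is not code from the project: the client data, the logistic-regression model, and all function names are illustrative assumptions chosen to keep the example self-contained.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One site's local training step: plain logistic-regression
    gradient descent on data that never leaves the site."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * (X.T @ (preds - y)) / len(y)
    return w

def federated_round(global_w, clients):
    """Aggregate the locally trained models by weighted parameter
    averaging: only model parameters are shared, never raw data."""
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    local_models = [local_update(global_w, X, y) for X, y in clients]
    return np.average(local_models, axis=0, weights=sizes)

# Toy setup: two "sites" holding private, locally generated data.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])
clients = []
for n in (40, 60):
    X = rng.normal(size=(n, 2))
    y = (X @ true_w > 0).astype(float)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(20):  # 20 communication rounds
    w = federated_round(w, clients)
```

After a few rounds the averaged global model recovers the direction of the decision boundary, even though neither site ever saw the other's data. Real deployments add secure aggregation and differential-privacy mechanisms on top of this basic loop.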
Contribution of the Institute for AI Safety and Security
In collaboration with the University of Oldenburg, the Institute for AI Safety and Security is promoting and coordinating the practical development of federated machine learning in two critical application areas. In the field of data integration, the project team is investigating how the previously labour-intensive process of linking heterogeneous data silos can be automated with the support of AI. To this end, the team is developing a dedicated process model, which will be implemented and evaluated as a prototype.
In parallel, the project is focusing on natural language processing (NLP) to recognise contradictions in domain-specific regulatory documents. Regulatory documents from various departments and locations are utilised for this purpose, ensuring that sensitive content, such as emails, internal reports and business documents, remains in its secure administrative location. This decentralised processing enables local vocabularies, sentence structures and contextual relationships to be learnt that are not considered in centrally developed models. Selected NLP classification tasks are implemented and evaluated as prototypes to assess the improvement in the quality of global models.
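Decentralised NLP of this kind faces a practical obstacle: each site has its own local vocabulary, and exchanging vocabularies would leak sensitive content. One common workaround, sketched below as an illustrative assumption rather than the project's actual method, is the hashing trick: every site maps its tokens into the same fixed-size feature space without sharing any word list, so locally trained linear models remain parameter-compatible and can be averaged.

```python
import zlib
import numpy as np

DIM = 64  # shared hashed feature space; no vocabulary is ever exchanged

def featurize(text, dim=DIM):
    """Hash tokens into a fixed-size bag-of-words vector. zlib.crc32 is
    deterministic across processes, so every site maps the same token
    to the same index without coordinating a shared dictionary."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    return v

# Hypothetical snippets from two sites with disjoint local vocabularies:
site_a = featurize("termination notice period contract")
site_b = featurize("Vertrag Frist Kündigung")
# Both vectors live in the same 64-dimensional space, so classifiers
# trained locally on either site can be aggregated parameter-wise.
```

The trade-off is hash collisions: unrelated tokens may share an index, which is why real systems size the feature space generously or move to shared subword tokenisers. The key property for the federated setting is that the mapping is fixed and data-independent.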
Institutes and facilities involved (DLR & external)
• DLR Institute for AI Safety and Security
• Carl von Ossietzky University of Oldenburg, Department of Business Information Systems / Very Large Business Applications (VLBA)