SynthBAD – Synthetic batch data generation for active learning and domain adaptation
Collecting extensive real-world data is essential to the development of AI systems, but it often requires so much time and money that it is no longer economically justifiable. For this reason, there is an urgent need to generate the data required for AI training synthetically. However, various challenges remain, such as imbalance and redundancy in the training data. For example, the rarity of certain critical events (e.g. near misses) in the data must not lead to incorrect system behaviour. A frequently used technique for data generation is domain adaptation, in which existing data from a source domain (e.g. path and obstacle detection) is adapted to a specific target domain (e.g. path and obstacle detection in bad weather) without having to be annotated again. The aim of the project is therefore to develop, apply and demonstrate a tool chain that synthetically generates realistic camera data for applications in automated road traffic and robotics.

Left: © Rockstar Games (from GTA; source: Playing for Data: Ground Truth from Computer Games, tu-darmstadt.de). Right: © DLR
The project contributes to research on automated driving by modelling the tool chain on the sensors of the institute's ViewCar II and FASCar systems, which makes it possible to generate synthetic data for AI training with our vehicles. The DLR Institute of Transportation Systems defines the requirements for such a tool chain and designs, implements and tests it.

Project title: SynthBAD - Synthetic batch data generation for active learning and domain adaptation
Duration: 01/2023 to 12/2023
Project volume: €165,902.26
Contracting authority: DLR Transport Programme Directorate, German Aerospace Centre, Cologne
Project coordinator: DLR Institute for AI Safety and Security
DLR institutes involved: Institute of Robotics and Mechatronics; Institute of Transportation Systems
