The team develops vision algorithms for earth observation data that go beyond perception. In particular, this includes methods that combine image and text information, exploit the spatial relations between objects, and draw conclusions from temporal developments and trends. The team is currently conducting research in the following areas:
Vision-Language Reasoning
Most current AI models for earth observation are trained to perform a single task, such as assigning a scene to one of a pre-defined set of classes or detecting objects of pre-defined classes in an image. To address this limitation, the team is developing models that can answer natural language queries. Our vision is that this approach will allow non-experts to carry out AI-based analyses of earth observation data by communicating with the models through simple text prompts.
Spatial Reasoning
Conventional deep learning models for the segmentation of remote sensing imagery require a ground truth label for every single pixel in the training dataset. By exploiting the spatial relationships between objects and areas in earth observation imagery, spatial reasoning strategies can greatly reduce this labelling effort: instead of annotating all pixels, the annotator simply draws single dots, scribbles or polygons (see figure).
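To illustrate what training with such sparse annotations can look like, the following is a minimal sketch of a cross-entropy loss that is evaluated only at the pixels the annotator actually marked. It is an assumption for illustration, not necessarily the team's method: the function name `partial_cross_entropy` and the `IGNORE` marker are hypothetical, and the sketch covers only the sparse supervision itself, not how spatial relationships are used to inform the unlabelled pixels.

```python
import numpy as np

IGNORE = -1  # hypothetical marker for pixels the annotator did not label


def partial_cross_entropy(logits, sparse_labels, ignore_index=IGNORE):
    """Mean cross-entropy over annotated pixels only.

    logits:        float array of shape (H, W, C), one score per class.
    sparse_labels: int array of shape (H, W); ignore_index where unlabelled.
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    # Keep only the pixels that carry a dot, scribble or polygon label.
    labelled = sparse_labels != ignore_index
    if not labelled.any():
        return 0.0

    lp = log_probs[labelled]       # shape (N, C)
    y = sparse_labels[labelled]    # shape (N,)
    picked = lp[np.arange(y.size), y]
    return float(-picked.mean())


# Usage: a 4x4 image with 3 classes and only two annotated "dots".
logits = np.zeros((4, 4, 3))
labels = np.full((4, 4), IGNORE)
labels[1, 2] = 0
labels[3, 0] = 2
loss = partial_cross_entropy(logits, labels)
```

Because unlabelled pixels are masked out, they contribute nothing to the loss, so the annotation can remain as sparse as a handful of dots.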
Temporal Reasoning
In many cases, it matters not only how objects relate to each other in space, but also how they evolve over time. Temporal reasoning takes these developments into account, for example by drawing conclusions from trends observed over time.