High-definition (HD) mapping is essential in many applications, from autonomous driving to infrastructure monitoring and urban management, for understanding complex urban infrastructure with centimeter-level accuracy. Aerial images provide valuable information over large areas almost instantaneously; nevertheless, no current dataset captures the complexity of aerial scenes at the level of granularity required by real-world applications. To address this, we introduce SkyScapes, an aerial image dataset with highly accurate, fine-grained annotations for pixel-level semantic labeling. SkyScapes provides annotations for 31 semantic categories, ranging from large structures, such as buildings, roads, and vegetation, to fine details, such as 12 (sub-)categories of lane markings. DLR-SkyScapes was published in the reference listed below; please cite it if you use the dataset in your work.
The following 31 semantic categories have been annotated: low vegetation, paved road, non-paved road, paved parking place, non-paved parking place, bike-way, sidewalk, entrance/exit, danger area, building, car, trailer, van, truck, large truck, bus, clutter, impervious surface, tree, and 12 lane-marking types. The lane-marking types are: dash-line, long-line, small dash-line, turn sign, plus sign, other signs, crosswalk, stop-line, zebra zone, no parking zone, parking zone, and other lane-markings.
These 31 semantic classes pose different challenges. We have therefore defined the following benchmarks:
1) SkyScapes-Dense, with 20 classes, where the 12 lane-marking types are merged into a single lane-marking class;
2) SkyScapes-Lane, with 13 classes, comprising the 12 lane-marking classes and a non-lane-marking class;
3) SkyScapes-Dense-Category, with 11 merged classes: nature (low vegetation, tree), driving area (paved, non-paved), parking area (paved, non-paved), human area (bikeway, sidewalk, danger area), shared human and vehicle area (entrance/exit), road feature (lane-marking), residential area (building), dynamic vehicle (car, van, truck, large truck, bus), static vehicle (trailer), man-made surface (impervious surface), and other objects (clutter);
4) SkyScapes-Dense-Edge-Binary, for binary edge segmentation;
5) SkyScapes-Dense-Edge-Multi, for multi-class edge segmentation.
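As a rough illustration, the category-level benchmark can be seen as a lookup table collapsing the 20 SkyScapes-Dense classes into the 11 merged groups listed above. This is only a sketch: the string identifiers below are our own assumptions, not the official label names or numeric IDs of the released annotations.

```python
# Illustrative mapping from the 20 SkyScapes-Dense classes to the 11
# SkyScapes-Dense-Category groups described above. Class identifiers
# are assumed names, not the official label IDs.
DENSE_TO_CATEGORY = {
    "low_vegetation": "nature",
    "tree": "nature",
    "paved_road": "driving_area",
    "non_paved_road": "driving_area",
    "paved_parking": "parking_area",
    "non_paved_parking": "parking_area",
    "bikeway": "human_area",
    "sidewalk": "human_area",
    "danger_area": "human_area",
    "entrance_exit": "shared_human_vehicle_area",
    "lane_marking": "road_feature",
    "building": "residential_area",
    "car": "dynamic_vehicle",
    "van": "dynamic_vehicle",
    "truck": "dynamic_vehicle",
    "large_truck": "dynamic_vehicle",
    "bus": "dynamic_vehicle",
    "trailer": "static_vehicle",
    "impervious_surface": "man_made_surface",
    "clutter": "other_objects",
}

def to_category(dense_label: str) -> str:
    """Map a dense class name to its category-level class."""
    return DENSE_TO_CATEGORY[dense_label]
```

The same table can be applied pixel-wise to a dense label map to derive category-level ground truth.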
We split the dataset into training, validation, and test sets comprising 50%, 12.5%, and 37.5% of the data, respectively. We chose this particular split because of the class imbalance and to avoid splitting larger images. The training and validation sets will be publicly available; test images will be released as an online benchmark with undisclosed ground truth.
The images were acquired by the German Aerospace Center (DLR) during airborne acquisition flights over several cities in Germany and other European countries. The data collection was carried out from a helicopter or aircraft, using a low-cost camera array system consisting of three DSLR cameras mounted on a flexible platform. Only the nadir-looking images were selected. In total, 16 non-overlapping RGB images of size 5616×3744 pixels were chosen. A flight altitude of about 1000 m above ground led to a GSD of approximately 13 cm/pixel. The images represent urban and partly rural areas with highways, first- and second-order roads, and complex traffic situations, such as crossings and congestion.
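From the figures above one can back out the approximate ground footprint of each image and the number of images per split. This is a back-of-the-envelope sanity check based only on the stated numbers, not part of any official toolkit:

```python
# Derived from the stated acquisition parameters: 16 images of
# 5616 x 3744 px at ~13 cm/px GSD, split 50% / 12.5% / 37.5%
# into train / val / test.
GSD_M = 0.13                      # ground sampling distance, metres per pixel
WIDTH_PX, HEIGHT_PX = 5616, 3744  # image size in pixels
NUM_IMAGES = 16

# Approximate ground coverage of a single image, in metres.
width_m = WIDTH_PX * GSD_M    # ~730 m
height_m = HEIGHT_PX * GSD_M  # ~487 m

# Number of images per split.
splits = {name: round(NUM_IMAGES * frac)
          for name, frac in [("train", 0.5), ("val", 0.125), ("test", 0.375)]}
# -> {'train': 8, 'val': 2, 'test': 6}
```

So each image covers roughly 730 m × 487 m on the ground, and the 16 images divide into 8 training, 2 validation, and 6 test images.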
S. Azimi, C. Henry, L. Sommer, A. Schaumann, and E. Vig, "SkyScapes -- Fine-Grained Semantic Understanding of Aerial Scenes," in International Conference on Computer Vision (ICCV), October 2019.
To access the dataset please contact us using the contact link below.