Whether at a concert, rally, demonstration or trade fair, ensuring safety at large public events requires knowing accurately how many people have assembled. This year, with the pandemic, a new aspect was added: protecting those present from infection. To guarantee safety in this wider sense as well, precise numbers have become even more relevant.
Experts in photogrammetry and image analysis at the German Aerospace Center (DLR) have designed a 'learning algorithm' that automatically counts the people in photos and videos. Its particular advantage is that it can be used on a wide variety of information sources and quality levels. Using artificial intelligence methods, the system was trained to detect the edges and shapes that are typical of people in images. It does this in real time and in compliance with data protection regulations, which is possible because it cannot recognize individuals; it can only count them or record their presence.
Reliable automatic estimates, even as conditions change
"We most recently analysed a YouTube video that someone recorded on his cell phone during a mass rally. Some of the images were rather blurred. The algorithm is tolerant as far as the quality of the source material is concerned. The images do not have to have particularly high resolution or be taken from a stationary location", explains Dr. Reza Bahmanyar of the DLR Remote Sensing Technology Institute. "The system’s preliminary estimate of the size of a crowd was already astonishingly close to the police estimate. We still have a high error ratio, of course, but as we refine the system that should go down considerably."
Changes in lighting, camera position, image quality or camera angle do not matter: the system supplies relatively precise estimates even when image parameters change very rapidly. From one pass to the next, the AI-based system uses graphic indicators to improve how precisely it records people and estimates how many of them are present, without identifying them. "We only record statistical values, not images or personal data. Privacy is not compromised with our method", stresses DLR expert Bahmanyar.
In the case of cell phone videos, the team manually specifies three things: the velocity of the crowd, the so-called 'region of interest' in the image (the segment of the source material used for the analysis), and the scan rate. The scan rate depends on the velocity of the flow of people and is chosen so that the individual images analysed by the system show different people each time. This ensures that the researchers do not count people twice.
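The relationship between crowd velocity and scan rate can be illustrated with a small sketch. It is not DLR's implementation: the function names, the idea of measuring the region of interest as a length in metres along the direction of movement, and the `count_people` callback are all assumptions made for illustration. The sketch only shows the principle that consecutive analysed frames should be far enough apart in time for the crowd to have moved completely through the region of interest.

```python
import math
from typing import Callable, Sequence, TypeVar

Frame = TypeVar("Frame")


def frame_step(roi_length_m: float, crowd_speed_m_s: float, fps: float) -> int:
    """Number of video frames to skip between two analysed frames.

    Assumption: the crowd moves at a roughly constant speed through a
    region of interest of known length, so after roi_length / speed
    seconds the region shows an entirely new set of people.
    """
    if roi_length_m <= 0 or crowd_speed_m_s <= 0 or fps <= 0:
        raise ValueError("all parameters must be positive")
    crossing_time_s = roi_length_m / crowd_speed_m_s
    # Round up so that two analysed frames never share the same people.
    return max(1, math.ceil(crossing_time_s * fps))


def estimate_total(
    frames: Sequence[Frame],
    count_people: Callable[[Frame], int],
    step: int,
) -> int:
    """Sum the per-frame counts of every `step`-th frame.

    `count_people` stands in for the trained counting model (hypothetical).
    """
    return sum(count_people(frame) for frame in frames[::step])
```

For example, with a 20 m region of interest, a crowd walking at 1 m/s and a 25 fps video, `frame_step(20.0, 1.0, 25.0)` yields 500, meaning one frame every 20 seconds would be analysed. A smaller step would count some people twice; a larger one would miss people who pass through between samples.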
The whole procedure can be further automated if the camera parameters are known in advance. If the images are recorded by permanently installed overview cameras, the image segment and scan rate can be chosen automatically. In addition, as the angle between the camera’s line of sight and the crowd’s direction of movement increases, individuals overlap less in the image, so the estimate becomes more accurate. Likewise, the higher the image quality, the better the count. Since the system works even when both the crowd and the camera are moving, it is also possible to analyse images from flying platforms such as camera drones.
The team at DLR’s Remote Sensing Technology Institute has trained the algorithm to record people. But since the system is based on a deep neural network, it could also be applied in a wide range of other areas. Teaching the AI algorithm to recognize other features requires only training time and a number of training rounds. For example, it could be used for quality assurance by detecting surface defects, or for environmental monitoring by estimating the number of particles such as plankton in underwater images.