Lightweight multi-drone detection and 3D-localization via YOLO

Summary: Researchers from the Indian Institute of Technology Kanpur recently evaluated a method for real-time detection and three-dimensional localization of multiple drones, using the state-of-the-art tiny-YOLOv4 object detection algorithm and stereo triangulation. They also released the project's source code, pre-trained models, and curated synthetic stereo dataset to help advance this research across the community.

Challenge: Unmanned Aerial Vehicles (UAVs) have gained massive popularity in recent years, owing to advances in technology and a surge in use cases that include traffic management, security and surveillance, supply of essentials, disaster management, and warehouse operations. Drones were initially a military, surveillance, and security tool. Early versions were much larger, but over time they became smaller and smarter.

With the development of small, agile drones, their applications have repeatedly raised security concerns. Their increasing use in swarm systems has also opened another research direction: dynamic detection and localization of multiple drones in such systems, especially for counter-drone systems.

Deep learning based drone detectors have steadily improved at object detection, but they have also grown bulkier and rely heavily on powerful computing hardware.

Finding: The researchers' computer vision approach eliminated the need for computationally expensive stereo matching algorithms, significantly reducing the memory footprint and making the system deployable on embedded systems.

Their drone detection system was highly modular (with support for various detection algorithms) and capable of identifying multiple drones in a system, with real-time detection accuracy of up to 77% at an average of 332 FPS (on an Nvidia Titan Xp). They also tested the complete pipeline in the AirSim environment, detecting drones at a maximum distance of 8 meters with a mean error of 23% of the distance.

The researchers found that the neural network based tiny-YOLOv4 algorithm attained higher frame rates and detection accuracy than leading CPU based algorithms and that, coupled with their classical stereo triangulation based depth estimation module, it can be used for 3D localization.
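To make the triangulation step concrete, here is a minimal sketch of classical stereo triangulation for a rectified camera pair. It is not the researchers' code: it assumes the same drone has already been detected in both images, and that the two bounding box centers serve as the matched point, which is one plausible way to avoid dense stereo matching. The `StereoCamera` class, the `triangulate` function, and all parameter values are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StereoCamera:
    """Hypothetical intrinsics for a rectified stereo pair (illustrative only)."""
    focal_px: float    # focal length in pixels, assumed equal for both cameras
    baseline_m: float  # distance between the two camera centers, in meters
    cx: float          # principal point x, in pixels
    cy: float          # principal point y, in pixels


def triangulate(cam, left_center, right_center):
    """Return (X, Y, Z) in meters, expressed in the left-camera frame.

    left_center and right_center are (u, v) pixel coordinates of the same
    drone's bounding box center in the rectified left and right images.
    """
    u_l, v_l = left_center
    u_r, _ = right_center

    # In a rectified pair the match lies on the same image row, so only the
    # horizontal offset (disparity) carries depth information.
    disparity = u_l - u_r
    if disparity <= 0:
        raise ValueError("non-positive disparity: boxes mismatched or point at infinity")

    # Depth from similar triangles: Z = f * B / d.
    z = cam.focal_px * cam.baseline_m / disparity

    # Back-project the left pixel through the pinhole model at depth Z.
    x = (u_l - cam.cx) * z / cam.focal_px
    y = (v_l - cam.cy) * z / cam.focal_px
    return x, y, z
```

For example, with a 500-pixel focal length and a 0.2-meter baseline, a disparity of 20 pixels corresponds to a depth of 5 meters. Because depth is inversely proportional to disparity, a fixed pixel error in the box centers produces a larger depth error at longer range.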

How Labelbox was used: The Labelbox platform was the researchers' primary annotation tool, used to enrich their dataset with images containing multiple drones. In addition to images of drones, the dataset contains images of drone-like “negative” objects that are not drones, to keep the model from overfitting. The dataset contains 5529 images, each with a corresponding annotation file that records the bounding box parameters (height, width, and center x and y coordinates) and the object class.
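As an illustration of how such annotations might be consumed, the sketch below converts one annotation line into pixel-space box corners. It assumes the Darknet/YOLO text convention, `class x_center y_center width height`, with coordinates normalized to the image size. The article does not state the field order or the normalization used in this dataset, so treat both as assumptions; `yolo_to_pixel_box` is a hypothetical helper name.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one 'class cx cy w h' line to (class_id, x_min, y_min, x_max, y_max).

    Assumes Darknet-style values normalized to [0, 1]; output is in pixels.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")

    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:])

    # Move from center/size to corner coordinates, then scale to pixels.
    x_min = (cx - w / 2) * img_w
    y_min = (cy - h / 2) * img_h
    x_max = (cx + w / 2) * img_w
    y_max = (cy + h / 2) * img_h
    return class_id, x_min, y_min, x_max, y_max
```

Under these assumptions, a box centered in a 640x480 image that covers a quarter of its width and half of its height spans x from 240 to 400 and y from 120 to 360 pixels.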

You can read the full PDF here.