
Calendar

11 June

Master's Thesis presentation: "BEV-Based Multi-Camera Detection for Sports Player Localization"

Time: 2025-06-11 13:15 to 14:00 Seminar

Authors: Hampus Sjöholm and Johanna Glatz
Supervisors: Mikael Nilsson, Centre for Mathematical Sciences; Håkan Ardö, Spiideo AB; Daniel Milesson, Spiideo AB
Examiner: Magnus Oskarsson, Centre for Mathematical Sciences

Abstract
Accurate detection of player positions is essential for sports analysis. Traditional single-camera methods generally suffer from occlusion and from diminishing resolution at long distances. This work proposes using multi-camera Bird’s Eye View (BEV) detection to overcome these issues. Synthetically generated data was used to train and test the detection model, which uses images from six cameras placed around a handball court and pre-trained segmentation networks as feature extractors. Player masks are generated, projected onto the ground plane, stacked, and used as input to a detection network that retrieves player positions. The player masks were generated by a segmentation network; two such networks, Mask R-CNN and YOLOv11x-seg, were tested. The player locations were found from the projected segmentation masks using a detection network in the form of a U-Net, a neural network originally designed for segmentation but here adapted to produce heat maps from which the player locations are extracted. Two differently sized U-Nets were tested. Experiments evaluated the effect of varying camera count, placement, and input order. The results show that a model consisting of Mask R-CNN and a Deep U-Net outperforms all other models, and in particular all single-camera configurations. Detection accuracy improves significantly with additional cameras, especially when increasing from 2 to 3 cameras and thereby from 1 to 2 viewpoints. Shuffling the stacked images decreased performance significantly, indicating that a model would benefit from using the camera calibration as input. Using multiple cameras in a BEV detector shows great promise for accurate player detection, and with a more case-specific feature extractor the performance could likely be improved further.
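
The projection, stacking, and heat-map steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`warp_mask_to_bev`, `stack_views`, `heatmap_peaks`), the grid resolution parameter, the nearest-neighbour sampling, and the 3x3 local-maximum rule are all assumptions made for illustration. It also assumes that each camera's image-to-ground homography is already known from calibration. The U-Net that turns the stacked masks into a heat map is not modelled here.

```python
import numpy as np


def warp_mask_to_bev(mask, H_img_to_ground, bev_shape, metres_per_cell):
    """Project a binary image mask onto a ground-plane grid (hypothetical helper).

    H_img_to_ground is an assumed 3x3 homography mapping homogeneous pixel
    coordinates (u, v, 1) to ground-plane coordinates (x, y, w).
    """
    rows, cols = bev_shape
    # Ground coordinates of every BEV cell centre, in homogeneous form.
    ys, xs = np.mgrid[0:rows, 0:cols]
    ground = np.stack(
        [(xs + 0.5) * metres_per_cell,
         (ys + 0.5) * metres_per_cell,
         np.ones((rows, cols))],
        axis=-1,
    ).reshape(-1, 3)
    # Inverse mapping: for each ground cell, look up the source pixel.
    H_ground_to_img = np.linalg.inv(H_img_to_ground)
    pix = ground @ H_ground_to_img.T
    w = pix[:, 2]
    valid = np.abs(w) > 1e-9
    safe_w = np.where(valid, w, 1.0)  # avoid division by zero
    u = np.floor(pix[:, 0] / safe_w).astype(int)
    v = np.floor(pix[:, 1] / safe_w).astype(int)
    inside = valid & (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])
    bev = np.zeros(rows * cols, dtype=np.float32)
    bev[inside] = mask[v[inside], u[inside]]  # nearest-neighbour sampling
    return bev.reshape(rows, cols)


def stack_views(masks, homographies, bev_shape, metres_per_cell):
    """Stack the projected masks of all cameras into one (n_cameras, rows, cols) array."""
    return np.stack(
        [warp_mask_to_bev(m, H, bev_shape, metres_per_cell)
         for m, H in zip(masks, homographies)],
        axis=0,
    )


def heatmap_peaks(heatmap, threshold=0.5):
    """Return (row, col) cells above the threshold that are 3x3 local maxima."""
    heatmap = np.asarray(heatmap, dtype=float)
    rows, cols = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Maximum over each cell's 3x3 neighbourhood (including the cell itself).
    neighbourhood_max = np.stack(
        [padded[dr:dr + rows, dc:dc + cols] for dr in range(3) for dc in range(3)]
    ).max(axis=0)
    is_peak = (heatmap >= neighbourhood_max) & (heatmap > threshold)
    return [(int(r), int(c)) for r, c in np.argwhere(is_peak)]
```

In this sketch the channel order of the stacked array follows the order of the camera list, which is the kind of input ordering that the shuffling experiment in the abstract varies.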



About the event
Time: 2025-06-11 13:15 to 14:00

Location
MH:333

Contact
mikael.nilsson@math.lth.se
