
Calendar

04

June

Han Fu & Jialu Xu Thesis presentation

Time: 2025-06-04 10:15 to 11:00 Seminar

Han Fu & Jialu Xu will present their thesis "Unified Pre-Training for Multi-Modal Sensor Data in Autonomous Driving" (conducted in collaboration with Zenseact).

Abstract:

Accurate, efficient and transferable scene representations are central to autonomous driving. Modelling the environment with learnable three-dimensional Gaussians offers a favourable trade-off between memory footprint and geometric fidelity: each Gaussian stores local occupancy and semantic context in a continuous, object-centric form. This thesis turns the 3D Gaussian representation into a unified pre-training framework for downstream perception. Building on GaussianFormer, we add a lightweight sensor-guided initialization that merges evidence from lidar and radar at the beginning of training, and a cross-attention mechanism that helps extract geometric cues from distance signals. A backbone trained with this strategy can be frozen and reused as a feature extractor for tasks beyond its original objective.

To test transferability, the trained backbone is frozen and paired with a simple decoder that receives only the Gaussian field and learns bird's-eye-view (BEV) vehicle segmentation. With only a quarter of the labelled data, the decoder outperforms the camera-only baseline, and the modified backbone also delivers a measurable gain on standard 3D occupancy benchmarks. These findings position the 3D Gaussian representation as a versatile foundation for multi-modal perception: a single pre-trained representation can be shared across semantic segmentation tasks.



About the event
Time: 2025-06-04 10:15 to 11:00

Location
MH:309A

Contact
Magnus.oskarsson@math.lth.se

Page manager: webbansvarig@math.lu.se | 2016-06-20