
Calendar

10

June

Deep Learning for Rotation Averaging in Structure from Motion

Time: 2026-06-10 14:15 to 15:00 Degree project

Bachelor's thesis presentation, Hampus Serneke

In this paper, we mainly investigate how different rotation representations affect
deep learning methods for solving the rotation averaging problem. We conclude
that the matrix representation with SVD+ projection disabled during training seems
to outperform both the quaternion and the rotation vector representations on
synthetic data. We also demonstrate that disabling SVD+ projection during training
significantly improves the performance and stability of the rotation matrix
representation. Both results have a theoretical foundation in recent research, and
they can be seen as an extension of that work to the rotation averaging domain.
Since some popular deep learning methods for rotation averaging use quaternions,
we suggest that switching to matrix representations with SVD+ projection disabled
during training might improve on state-of-the-art performance.
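
As a rough illustration of what the SVD+ projection does, the sketch below maps an arbitrary 3x3 matrix to the nearest rotation matrix using NumPy. The function names `svd_plus` and `network_output` are hypothetical, and the way the projection is switched off during training (the loss is computed on the raw matrix, and the projection is applied only at inference) is an assumption about the setup, not the thesis code.

```python
import numpy as np

def svd_plus(m):
    """Project a 3x3 matrix onto SO(3) (nearest rotation in the Frobenius sense)."""
    u, _, vt = np.linalg.svd(m)
    # Flip the last singular direction when needed so that det(R) = +1.
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0
    return u @ np.diag([1.0, 1.0, d]) @ vt

def network_output(raw9, training):
    """Hypothetical output head: turn a raw 9-vector into a 3x3 matrix."""
    m = np.asarray(raw9, dtype=float).reshape(3, 3)
    # Assumption: with projection disabled during training, the raw matrix
    # is passed to the loss; the projection is only applied at inference.
    return m if training else svd_plus(m)
```

Calling `network_output(raw9, training=False)` always returns an orthogonal matrix with determinant +1, while `training=True` returns the unconstrained matrix.
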
We use two learning architectures as a testbed for comparing the rotation
representations: a naive architecture based on a fixed-size feed-forward neural
network (FFN), and a more dynamic graph neural network (GNN) architecture based on
a message-passing scheme. For the GNN architecture, we develop three models: a
naive graph convolutional network (GCN), an improved GCN-EdgeMLP model, and a
variant of a graph isomorphism network (GIN). For the GNN models, we resolve the
gauge ambiguity by rotating the entire graph with respect to an arbitrary node and
marking that node's initial feature vector. We demonstrate that only the GIN model,
when coupled with residual connections, is able to propagate the gauge information
throughout the entire graph.
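
The gauge-fixing step described above can be sketched as follows. This is a minimal illustration that assumes the convention R_ij = R_j R_i^T for relative rotations; the names `fix_gauge` and `relative` and the one-column flag feature are hypothetical choices, not taken from the thesis.

```python
import numpy as np

def fix_gauge(rotations, anchor):
    """Rotate all global rotations so that the anchor node becomes the identity.

    rotations: array of shape (N, 3, 3); anchor: index of the reference node.
    Returns the re-expressed rotations and a per-node flag marking the anchor.
    """
    r_anchor = rotations[anchor]
    # Right-multiplying every R_i by R_anchor^T leaves R_j R_i^T unchanged.
    fixed = rotations @ r_anchor.T
    # Assumed node feature: 1 for the anchor node, 0 for all other nodes.
    flags = np.zeros((len(rotations), 1))
    flags[anchor] = 1.0
    return fixed, flags

def relative(rotations, i, j):
    """Relative rotation from node i to node j (assumed convention)."""
    return rotations[j] @ rotations[i].T
```

Because the relative rotations are invariant under this transformation, the edge measurements stay the same, and only the regression targets and the anchor flag change.
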
For the FFN-based network, we train the model on a fully connected graph with
five Haar-sampled rotations as nodes, with Langevin noise applied to each pairwise
rotation edge. The model is tasked with regressing the global orientation of each
node given the noisy pairwise rotations. First, a naive loss function based on the
difference between representation vectors is employed. We then implement two loss
functions based on the geometric properties of rotations: the chordal distance
and the geodesic distance. However, the standard implementation of the geodesic
loss, based on arccos, is numerically unstable for angles close to zero. An
implementation using atan2 instead achieves high numerical precision around zero.
The RoMa PyTorch library also provides an implementation designed specifically for
numerical precision using arcsin, which also handles zero well. However, we show
that this method is numerically imprecise close to π, which atan2 is not. The atan2
implementation is slower than arcsin, though, so we propose a hybrid method that
calls atan2 only when needed. Because of these numerical concerns, we nevertheless
opt for the chordal distance in our experiments.
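
The numerical behaviour of the three angle formulas can be reproduced with a small NumPy sketch in double precision. The functions below are illustrative re-implementations based on standard identities (trace for arccos, Frobenius norm of R - I for arcsin, and the skew-symmetric part for atan2); they are not the RoMa code or the thesis code, and the threshold in `angle_hybrid` is an assumed value chosen only to show the idea of falling back to atan2 when arcsin loses precision.

```python
import numpy as np

SQRT8 = 2.0 * np.sqrt(2.0)

def angle_arccos(r):
    """Rotation angle from the trace: theta = arccos((tr(R) - 1) / 2)."""
    return np.arccos(np.clip((np.trace(r) - 1.0) / 2.0, -1.0, 1.0))

def angle_arcsin(r):
    """Rotation angle from the chord: ||R - I||_F = 2*sqrt(2)*sin(theta/2)."""
    x = np.linalg.norm(r - np.eye(3)) / SQRT8
    return 2.0 * np.arcsin(min(x, 1.0))

def angle_atan2(r):
    """Rotation angle from both sin(theta) and cos(theta)."""
    sin_t = np.linalg.norm(r - r.T) / SQRT8   # ||R - R^T||_F = 2*sqrt(2)*|sin(theta)|
    cos_t = (np.trace(r) - 1.0) / 2.0
    return np.arctan2(sin_t, cos_t)

def angle_hybrid(r, threshold=0.9):
    """Assumed hybrid: arcsin by default, atan2 only for large angles."""
    x = np.linalg.norm(r - np.eye(3)) / SQRT8
    return 2.0 * np.arcsin(x) if x < threshold else angle_atan2(r)

def rot_z(theta):
    """Rotation by theta about the z-axis, used here as a test input."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

for theta in (1e-9, np.pi - 1e-9):
    r = rot_z(theta)
    errors = [abs(f(r) - theta) for f in (angle_arccos, angle_arcsin, angle_atan2)]
    print(theta, errors)
```

For a rotation of 1e-9 radians the arccos formula loses the angle because the cosine rounds to 1, and for a rotation of π - 1e-9 the arcsin formula loses it because the sine of the half angle rounds to 1, whereas the atan2 formula stays accurate in both cases.
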
We compare the rotation matrix with SVD+ enabled against non-canonicalized
quaternions and rotation vectors in the FFN model, tuning the hyperparameters of
each representation with a Tree-structured Parzen Estimator (TPE). We also study
the effect of sampling rotations from different distributions, and theorize that
the Haar distribution might not be ideal for similarity to SfM data. In general,
the rotation vector performs worse than the other two representations in the FFN
model, especially for large rotations. We also train the GIN model on graphs of
varying size with noisy edges between some node pairs, and observe that the
rotation matrix without SVD+ achieves the highest performance.
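
A synthetic problem instance of the kind used for the FFN experiments could be generated roughly as below. The sketch samples Haar-distributed rotations from normalized Gaussian quaternions and builds all pairwise relative rotations on a fully connected graph. Note the assumptions: the edge noise here is a simple random-axis rotation with a Gaussian angle, used as a stand-in and not the Langevin noise of the thesis; the convention R_ij = R_j R_i^T and all function names are hypothetical.

```python
import numpy as np

def haar_rotation(rng):
    """Haar (uniform) sample on SO(3) via a normalized Gaussian quaternion."""
    q = rng.normal(size=4)
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def axis_angle_rotation(axis, angle):
    """Rodrigues' formula for a unit axis and an angle in radians."""
    k = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * k + (1.0 - np.cos(angle)) * (k @ k)

def noisy_relative_rotations(rotations, sigma, rng):
    """Pairwise R_ij = R_j R_i^T for all i < j, each perturbed by noise.

    Stand-in noise model (assumption): random axis, angle ~ N(0, sigma^2).
    """
    edges = {}
    n = len(rotations)
    for i in range(n):
        for j in range(i + 1, n):
            axis = rng.normal(size=3)
            axis /= np.linalg.norm(axis)
            noise = axis_angle_rotation(axis, rng.normal(scale=sigma))
            edges[(i, j)] = noise @ rotations[j] @ rotations[i].T
    return edges

def chordal_distance(r1, r2):
    """Chordal distance: Frobenius norm of the difference of two rotations."""
    return np.linalg.norm(r1 - r2)

rng = np.random.default_rng(0)
nodes = np.stack([haar_rotation(rng) for _ in range(5)])
edges = noisy_relative_rotations(nodes, sigma=0.1, rng=rng)
```

Swapping `haar_rotation` for a different sampler is the natural place to experiment with other rotation distributions.
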
 



About the event
Time: 2026-06-10 14:15 to 15:00

Location
MH:333

Contact
carl [dot] olsson [at] math [dot] lth [dot] se

Page manager: webbansvarig@math.lu.se | 2017-05-23