Calendar
04
December
Master's Thesis - Time-Shift Estimation and Audio Compression using Machine Learning
Time-Difference of Arrival (TDOA) measurements are important in several areas, with applications in microphone array calibration, speaker diarisation, beamforming, mapping, and positioning. This thesis investigates whether a transformer network can improve such measurements in a setup with a moving sound source. Limitations on the input size of transformer networks required us to reduce the dimensionality of the audio data. We therefore studied how both linear models (PCA and FFT) and non-linear models (autoencoders) could be used for this task. In particular, we studied their performance on different data distributions and at varying degrees of compression. A notable result was that the non-linear models outperformed the linear ones on a data set consisting of sinusoidal waves with varying frequency. As for the transformer network, our results were not sufficient to draw conclusions about its viability. Further research is needed to determine whether these methods could form a viable solution to the time-estimation problem.
Ida Buhre and Johan Larsson present their master's thesis
Monday, 4 December, at 10:15 in MH:227
and on Zoom
https://lu-se.zoom.us/j/64626643697
Title (English):
Time-Shift Estimation and Audio Compression using Machine Learning
Title (Swedish):
Tidsförskjutningsuppskattning och ljudkomprimering med maskininlärning
Supervisors:
Kalle Åström, supervisor, Centre for Mathematical Sciences
Erik Tegler, co-supervisor, Centre for Mathematical Sciences
Examiner:
Mikael Nilsson, examiner, Centre for Mathematical Sciences
About the event
Time:
2023-12-04 10:15
to
11:15
Location
MH:227
Contact
karl [dot] astrom [at] math [dot] lth [dot] se