
Calendar

04

December

Master's Thesis - Time-Shift Estimation and Audio Compression using Machine Learning

Time: 2023-12-04 10:15 to 11:15 Seminar

Time-Difference of Arrival (TDOA) measurements are important in several areas, with applications in microphone array calibration, speaker diarisation, beamforming, mapping, and positioning. This thesis studies whether a transformer network could improve such measurements in a setup with a moving sound source. Because transformer networks limit the input size, we had to reduce the dimensionality of the audio data. We studied how both linear models (PCA and FFT) and non-linear models (autoencoders) could be used for this task, in particular how they perform on different data distributions and under varying degrees of compression. A notable result was that the non-linear models outperformed the linear ones on a data set consisting of sinusoidal waves with varying frequency. As for the transformer network, our results were insufficient to draw conclusions about its viability. Further research is needed to determine whether these methods could form a viable solution to the time-shift estimation problem.
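For readers unfamiliar with TDOA, the following is a minimal sketch of the classical cross-correlation baseline for estimating a time shift between two microphone signals. It is not the transformer-based method studied in the thesis. The function name `estimate_delay_samples`, the sampling rate `fs`, and the synthetic white-noise signal are assumptions made purely for illustration.

```python
import numpy as np


def estimate_delay_samples(ref, delayed):
    """Return the lag (in samples) at which `delayed` best matches `ref`.

    A positive result means `delayed` lags behind `ref`.
    """
    # Full cross-correlation: entry k holds sum_n delayed[n + k] * ref[n].
    corr = np.correlate(delayed, ref, mode="full")
    # Lags corresponding to each entry of `corr`.
    lags = np.arange(-(len(ref) - 1), len(delayed))
    return int(lags[np.argmax(corr)])


# Illustrative setup (all values hypothetical): white noise recorded by two
# microphones, where the second one receives the signal 37 samples later.
fs = 16_000  # assumed sampling rate in Hz
rng = np.random.default_rng(0)
source = rng.standard_normal(1000)
true_delay = 37
mic_a = source
mic_b = np.concatenate([np.zeros(true_delay), source[:-true_delay]])

lag = estimate_delay_samples(mic_a, mic_b)
tdoa_seconds = lag / fs
print(f"estimated lag: {lag} samples, TDOA: {tdoa_seconds:.6f} s")
```

This baseline assumes a fixed delay over the whole window. With a moving sound source, as in the thesis setup, the delay changes over time, which is one reason to look at more flexible estimators.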

Ida Buhre and Johan Larsson present their Master's thesis

Monday 4 December at 10:15 in MH:227

and on Zoom

https://lu-se.zoom.us/j/64626643697

Title (English):
Time-Shift Estimation and Audio Compression using Machine Learning

Title (Swedish):
Tidsförskjutningsuppskattning och ljudkomprimering med maskininlärning


Supervisors:
Kalle Åström, supervisor, Centre for Mathematical Sciences
Erik Tegler, co-supervisor, Centre for Mathematical Sciences

Examiner:
Mikael Nilsson, examiner, Centre for Mathematical Sciences

About the event
Time: 2023-12-04 10:15 to 11:15

Location
MH:227

Contact
karl [dot] astrom [at] math [dot] lth [dot] se

Page manager: webbansvarig@math.lu.se | 2017-05-23