Calendar
16
January
Master Thesis Presentation - Vidar Tobrand
Vidar Tobrand presents his master thesis "Financial KPI Extraction using Large Language Models"
Examiner: Carl Olsson
Advisors: Alexandros Sopasakis and Oskar Åström
Abstract. One of many manual tasks in the banking sector is extracting financial Key Performance Indicators (KPIs) from annual reports. This can be automated with Machine Learning (ML) techniques such as Question Answering (QA), Named Entity Recognition (NER), Entity Relation (ER), Text-to-SQL, or Retrieval-Augmented Generation (RAG). This thesis evaluates how different models perform at extracting KPIs from annual reports, both through a literature review and by training custom models. It investigates whether a custom-trained model outperforms current state-of-the-art (SOTA) models. The thesis shows that a custom-trained extractive model yields an Exact Match (EM) of 27.9 to 53.9 %. The result is based on a training dataset created from annual reports in XHTML format and a test dataset based on the same annual reports in PDF format. The thesis compares the custom-trained extractive model to SOTA models such as ChatGPT and GPT-5-nano; the latter scores an EM of 58 %. However, the results for the generative models should be seen as a lower bound, since minimal effort was put into prompt engineering and fine-tuning. During training, given the correct context, the custom-trained extractive model achieves 95 % EM on the validation set. This shows high potential and indicates that improving the retrieval of the correct context can significantly increase the model's end-to-end performance. The thesis also compares the suitability of different model architectures for extracting financial KPIs from annual reports. Lastly, the thesis includes the creation of a new dataset for KPI extraction from annual reports.
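For readers unfamiliar with the Exact Match metric quoted in the abstract, the following is a minimal sketch of how such a score is commonly computed. It is not the thesis's evaluation code: the function names `normalize` and `exact_match` and the normalization rules (lowercasing, removing whitespace and comma thousands separators) are assumptions made for illustration only.

```python
import re


def normalize(text: str) -> str:
    """Assumed normalization: lowercase, then drop whitespace and commas.

    Note: dropping commas treats them as thousands separators, which would
    be wrong for values written with a decimal comma.
    """
    return re.sub(r"[\s,]", "", text.strip().lower())


def exact_match(predictions: list[str], references: list[str]) -> float:
    """Return the percentage of predictions equal to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * hits / len(references)


if __name__ == "__main__":
    # Hypothetical KPI values, not taken from the thesis dataset.
    predicted = ["1,250", "3.4", "n/a", "87"]
    expected = ["1 250", "3.5", "N/A", "78"]
    print(f"EM: {exact_match(predicted, expected):.1f} %")
```

Under a metric like this, a prediction that is numerically close but not identical counts as a miss, which is one reason EM is a strict measure for extracted figures.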
About the event
Time:
2026-01-16 13:15
to
14:15
Location
MH:330
Contact
alexandros [dot] sopasakis [at] math [dot] lth [dot] se