
Master's thesis presentation: Kaan Nadir Sirin

Time: 2025-06-05, 10:15 to 11:15 (Seminar)

Kaan Nadir Sirin presents his master's thesis, titled: Knowledge Distillation in Large Language Models: Sparse Logits, Style Transfer, and Low-Data Regimes

Abstract:

Large Language Models (LLMs) have achieved remarkable performance across a wide range of tasks but are often computationally prohibitive to deploy. Knowledge distillation offers a promising solution by transferring knowledge from large teacher models into smaller, more efficient student models. This thesis investigates logit-based distillation in decoder-only LLMs with a focus on practical and data-efficient methods. First, it examines how distillation design choices—such as KL divergence direction, temperature scaling, and sampling size—influence student model performance. Then, it introduces a novel sparse logit distillation pipeline enabling distillation in low-resource scenarios, including style transfer and domain adaptation without high-quality labeled datasets. In a style injection task, the proposed method reliably transferred a target persona (“sassy teenager”) using only 320 instruction samples, outperforming supervised fine-tuning. Finally, the method was applied to medical knowledge transfer from a 70B teacher to an 8B student, achieving superior response quality and improved accuracy. Results suggest that sparse logit distillation can enhance both efficiency and flexibility of knowledge transfer in LLMs, especially in underexplored or resource-constrained settings.
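
The key ingredients named in the abstract (a temperature-scaled KL loss in either direction, restricted to a sparse top-k slice of the teacher's logits) can be summarized compactly. The following is a minimal PyTorch sketch under assumed tensor shapes and defaults; the function name sparse_kd_loss, the top-k renormalization, and all hyperparameter values are illustrative assumptions, not the thesis's actual pipeline.

import torch
import torch.nn.functional as F

def sparse_kd_loss(student_logits, teacher_logits, k=32, temperature=2.0,
                   reverse=False):
    """KL distillation restricted to the teacher's top-k logits per token.

    student_logits, teacher_logits: (batch, seq_len, vocab_size).
    reverse=False gives forward KL(teacher || student); reverse=True gives
    reverse KL(student || teacher) -- the "KL divergence direction" choice.
    (Illustrative sketch; not the thesis implementation.)
    """
    # Keep only the teacher's k largest logits per position (sparse logits)
    # and gather the student's logits at the same vocabulary indices.
    top_vals, top_idx = teacher_logits.topk(k, dim=-1)
    student_top = student_logits.gather(-1, top_idx)

    # Temperature-scaled distributions, renormalized over the top-k support.
    t_logprobs = F.log_softmax(top_vals / temperature, dim=-1)
    s_logprobs = F.log_softmax(student_top / temperature, dim=-1)

    if reverse:
        # KL(student || teacher): mode-seeking, favors the teacher's peaks.
        kl = F.kl_div(t_logprobs, s_logprobs, log_target=True,
                      reduction="batchmean")
    else:
        # KL(teacher || student): mean-seeking, the classic distillation loss.
        kl = F.kl_div(s_logprobs, t_logprobs, log_target=True,
                      reduction="batchmean")

    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2

if __name__ == "__main__":
    s = torch.randn(2, 5, 1000)  # fake student logits
    t = torch.randn(2, 5, 1000)  # fake teacher logits
    print(sparse_kd_loss(s, t).item())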


Examiner:

Niels Christian Overgaard, Centre for Mathematical Sciences, Lund University


Supervisors:

Alexandros Sopasakis, Centre for Mathematical Sciences, Lund University

Oskar Åström, Centre for Mathematical Sciences, Lund University




About the event
Time: 2025-06-05, 10:15 to 11:15

Location
MH: 333

Contact
alexandros.sopasakis@math.lth.se
