lu.se



AI Lund Lunch seminar: Reading Älvsborg’s Ransom: How to turn 16th-century handwritten tax records into structured economic information

Illustration: symbolic depiction of OCR reading of old handwritten text

Seminar

Time: 2022-03-02 12:00 to 13:15
Location: Online
Contact: Jonas [dot] Wisbrant [at] cs [dot] lth [dot] se


A recording of the seminar can be accessed at ai.lu.se.

Title: Reading Älvsborg’s Ransom: How to turn 16th-century handwritten tax records into structured economic information

Speakers: 

  • Kerstin Enflo, Department of Economic History, Lund University
  • Christopher Blomqvist, project assistant, Mathematics, Lund University

When: 2 March at 12.00-13.15

Where: Online via Zoom

Spoken language: English

Abstract

We present a system for extracting tabular information from loosely structured handwritten documents. The system consists of three parts:

  • a U-Net-like CNN-based method for text detection and segmentation,
  • an attention-based method for simultaneous text recognition and classification of word parts, and
  • a method for matching the word parts into a tabular structure for each entry.

A key contribution is the observation that the attention-based recognition and classification module enables improved spatial analysis of the tabular information. The method is evaluated on a unique historical document: the Swedish Wealth Tax of 1571, consisting of 11,453 pages of handwritten tax records. The evaluation shows that the system provides a significant improvement over the state of the art on the problem of tabular extraction from loosely structured historical documents.
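As a rough illustration of the third part only (matching classified word parts into a tabular structure), the following Python sketch groups word parts into rows by vertical position and then places each part in the column named by its predicted class. It is not the speakers' method: the `WordPart` type, the labels "name", "item" and "amount", the `max_gap` threshold and the sample data are all made-up assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class WordPart:
    text: str   # recognized text of the word part
    label: str  # predicted class (hypothetical labels: "name", "item", "amount")
    x: float    # horizontal centre of the detected box
    y: float    # vertical centre of the detected box


def group_into_rows(parts, max_gap=10.0):
    """Start a new row whenever the vertical gap to the previous part exceeds max_gap."""
    rows = []
    prev_y = None
    for part in sorted(parts, key=lambda p: p.y):
        if prev_y is None or part.y - prev_y > max_gap:
            rows.append([])
        rows[-1].append(part)
        prev_y = part.y
    return rows


def rows_to_table(rows, columns):
    """Build one record per row, joining parts with the same label in left-to-right order."""
    table = []
    for row in rows:
        record = {col: "" for col in columns}
        for part in sorted(row, key=lambda p: p.x):
            if part.label in record:
                record[part.label] = (record[part.label] + " " + part.text).strip()
        table.append(record)
    return table


# Made-up sample input, not taken from the 1571 records.
parts = [
    WordPart("Oluff", "name", 12, 101),
    WordPart("Persson", "name", 40, 103),
    WordPart("silver", "item", 90, 102),
    WordPart("4", "amount", 130, 100),
    WordPart("Nils", "name", 11, 131),
    WordPart("koppar", "item", 88, 133),
    WordPart("2", "amount", 131, 130),
]
table = rows_to_table(group_into_rows(parts), ["name", "item", "amount"])
for record in table:
    print(record)
```

In this toy version the class label alone decides the column, and position is used only to separate entries and order words. A real system for loosely structured pages would need a more robust spatial analysis than a fixed vertical threshold.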

Related

Read about Älvsborg’s Ransom in Swedish at Wikipedia.org.