edoc

Multimodal analysis and retrieval for fragmentary historical documents with machine learning: a case study on ancient Egyptian hieratic papyri

Unter, Stephan Maximilian. Multimodal analysis and retrieval for fragmentary historical documents with machine learning: a case study on ancient Egyptian hieratic papyri. 2025, Doctoral Thesis, University of Basel, Faculty of Science.

Official URL: https://edoc.unibas.ch/96963/

Abstract

This thesis deals with the development and application of machine learning methods for the analysis of a corpus of ancient Egyptian papyri from Deir el-Medina in Egypt. The highly fragmented nature of this corpus makes it impossible to search for matches based on the external contours of the fragments. Instead, this study uses a multimodal approach that integrates a variety of visual and textual criteria to calculate similarities and establish connections between fragments. These include colour, surface texture, handwriting characteristics, and the genre of the written texts. The heterogeneous nature of the material examined presents an additional challenge, as even sections of the same original document may contain texts of different content or may have been written by different scribes. Consequently, the above criteria are considered independently of each other as far as possible.
For each criterion, a method is developed to embed a fragment in a corresponding feature space based on suitably selected data or samples. The positions of the embeddings can then be compared via a distance metric to define the similarity between fragments with respect to a given criterion. This makes it possible to generate a sorted result list for any given query object.
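The retrieval step described in this paragraph can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `rank_fragments`, the toy fragment identifiers, the two-dimensional vectors, and the choice of Euclidean distance are all assumptions made for illustration, since the abstract only speaks of "a distance metric".

```python
import math

def rank_fragments(query_vec, embeddings):
    """Sort fragment ids by distance to the query within one feature space.

    `embeddings` maps a (hypothetical) fragment id to its embedding vector.
    Euclidean distance is an assumed stand-in for the unspecified metric.
    """
    scored = [(math.dist(query_vec, vec), frag_id)
              for frag_id, vec in embeddings.items()]
    scored.sort()  # smallest distance first, ties broken by fragment id
    return [frag_id for _, frag_id in scored]

# Toy embeddings for a single criterion (values are made up).
toy_space = {
    "frag_a": (0.0, 0.0),
    "frag_b": (1.0, 0.0),
    "frag_c": (0.0, 3.0),
    "frag_d": (0.5, 0.5),
}

ranking = rank_fragments(toy_space["frag_a"], toy_space)
print(ranking)
```

The query fragment itself comes first in the list because its distance to itself is zero; a real system would probably filter it out before presenting the results.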
The individual feature spaces are integrated into a combined approach that allows the data to be sorted on the basis of one or more potentially weighted aspects. This methodology facilitates the formulation of controlled similarities based on comprehensible criteria, thus improving the interpretability of the results compared to a holistic end-to-end approach. In addition, the modular nature of this approach allows new aspects to be easily incorporated or individual models to be replaced by more advanced variants as they become available.
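One way to picture the weighted combination is the sketch below. It rests on assumptions and is not the method of the thesis: per-criterion distances are Euclidean, each criterion is normalised by its largest distance so that the weights are comparable, and the combined score is a weighted sum. The names `combined_ranking`, `spaces`, and `weights` and all toy values are hypothetical.

```python
import math

def combined_ranking(query_id, spaces, weights):
    """Rank fragments by a weighted sum of normalised per-criterion distances.

    `spaces` maps a criterion name to {fragment id: embedding vector};
    `weights` maps a criterion name to a non-negative weight.
    Assumes every fragment has an embedding in every weighted criterion.
    """
    totals = {}
    for criterion, weight in weights.items():
        space = spaces[criterion]
        dists = {fid: math.dist(space[query_id], vec)
                 for fid, vec in space.items() if fid != query_id}
        # Normalise by the largest distance (assumed scheme); avoid dividing by zero.
        scale = max(dists.values()) or 1.0
        for fid, dist in dists.items():
            totals[fid] = totals.get(fid, 0.0) + weight * dist / scale
    return sorted(totals, key=lambda fid: (totals[fid], fid))

# Two toy criteria with one-dimensional embeddings (values are made up).
spaces = {
    "colour":      {"q": (0.0,), "x": (1.0,), "y": (4.0,)},
    "handwriting": {"q": (0.0,), "x": (2.0,), "y": (1.0,)},
}

by_colour = combined_ranking("q", spaces, {"colour": 1.0})
by_both = combined_ranking("q", spaces, {"colour": 1.0, "handwriting": 3.0})
print(by_colour, by_both)
```

Because each criterion lives in its own entry of `spaces`, adding a new aspect or swapping in a different embedding model only changes that entry, which mirrors the modularity the abstract describes.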
Advisors: Roth, Volker
Committee Members: Vetter, Thomas and Fischer, Andreas
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Biomedical Data Analysis (Roth)
UniBasel Contributors: Roth, Volker and Vetter, Thomas
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 15657
Thesis status: Complete
Number of Pages: vi, 134
Language: English
Identification Number:
  • urn: urn:nbn:ch:bel-bau-diss156573
edoc DOI:
Last Modified: 22 Mar 2025 05:30
Deposited On: 21 Mar 2025 11:30