
PapyRow: A Dataset of Row Images from Ancient Greek Papyri for Writers Identification

Cilia, Nicole Dalia and De Stefano, Claudio and Fontanella, Francesco and Marthot-Santaniello, Isabelle and Scotto di Freca, Alessandra. (2021) PapyRow: A Dataset of Row Images from Ancient Greek Papyri for Writers Identification. In: Pattern Recognition. ICPR International Workshops and Challenges, ICPR 2021. pp. 223-234.

PDF - Accepted Version

Official URL: https://edoc.unibas.ch/84475/

Abstract

Papyrology is the discipline that studies texts written on ancient papyri. An important problem faced by papyrologists and, more generally, by paleographers is identifying the writers, also known as scribes, who contributed to the drawing up of a manuscript. Traditionally, paleographers perform qualitative evaluations to distinguish the writers; in recent years, these techniques have been combined with computer-based tools that automatically measure quantities such as the height and width of letters, distances between characters, inclination angles, and the number and types of abbreviations. Recently emerged approaches in digital paleography combine powerful machine learning algorithms with high-quality digital images. Some of these approaches have been used for feature extraction, others to classify writers with machine learning algorithms or deep learning systems. However, traditional techniques require a preliminary feature engineering step that involves an expert in the field. For this reason, publishing a well-labeled dataset is always both a challenge and a stimulus for the academic world, as researchers can test their methods and then compare their results from the same starting point. In this paper, we propose a new dataset of handwriting on papyri for the task of writer identification. The dataset is derived directly from the GRK-Papyri dataset, and the samples are obtained by applying image enhancement operations. This paper presents not only the details of the dataset but also the resizing, rotation, background smoothing, and row segmentation operations used to overcome the difficulties posed by the image degradation in this dataset. The dataset is made freely available for non-commercial research, along with confirmed ground-truth information for the task of writer identification.
Faculties and Departments: 04 Faculty of Humanities and Social Sciences > Departement Altertumswissenschaften > Fachbereich Alte Geschichte
04 Faculty of Humanities and Social Sciences > Departement Altertumswissenschaften > Fachbereich Alte Geschichte > Alte Geschichte (Huebner)
UniBasel Contributors: Marthot-Santaniello, Isabelle
Item Type: Conference or Workshop Item, refereed
Conference or workshop item Subtype: Conference Paper
Publisher: Springer
ISBN: 978-3-030-68786-1
e-ISBN: 978-3-030-68787-8
Series Name: Lecture Notes in Computer Science
Note: Publication type according to Uni Basel Research Database: Conference paper
Language: English
Last Modified: 31 Mar 2022 12:40
Deposited On: 30 Mar 2022 09:09
