Verteilte Korpusabfragesysteme (Distributed Corpus Query Systems)
Date Issued
2009-01-01
Author(s)
Abstract
Distributed text corpora have seen little use so far. The Swiss
Text Corpus (CHTK) and its partner projects set up a distributed corpus
for German ("Korpus C4"), virtually merging parts of their corpus data and
making them available through one common query platform.
Based on experience gained during this project, we propose a possible path
towards a more standardised interface for distributed corpus queries. This
should make it easier to integrate new as well as existing corpora into
distributed corpus systems. Special attention is paid to problems such as
responsibility assignment, performance, user management, format
unification and metadata synchronisation.
File(s)
Name
roth_linguistikonline_2009.pdf
Size
249.29 KB
Format
Adobe PDF
Checksum (MD5)
c627eae34f4e7be31a3a3395cac88f00