Using Web Corpora for the Recognition of Regional Variation in Standard German Collocations

Roth, Tobias. (2012) Using Web Corpora for the Recognition of Regional Variation in Standard German Collocations. In: Proceedings of the Seventh Web as Corpus Workshop (WAC7). pp. 31-38.

Official URL: http://edoc.unibas.ch/47562/

The Standard German collocation berührende Worte 'touching words' is only used in Austria. Yet it consists exclusively of Common German component words. It is the combination that makes it a regional variant, and one on a purely collocational level.
One of the goals in our German collocations dictionary project is to describe regional variation in the collocations collected. The main tool for this is a web corpus divided into three subcorpora. Each of them contains web pages from a different country code top-level domain (Austria, Germany and Switzerland).
We compare our method of recognising variation with a web corpus to previously used resources, and discuss the advantages, disadvantages and results of the web-corpus approach. The extent of regional variation in German collocations, and how it divides into collocational, lexical and syntactic variation, has so far been unknown; we present first estimates of both.
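The three-way split by country code top-level domain described above can be illustrated with a minimal Python sketch. It is not the procedure used in the paper, which this record does not specify. The function names, the (url, text) input format and the plain substring match are all assumptions made for illustration; a real study would presumably work on lemmatised, deduplicated text.

```python
from collections import Counter
from urllib.parse import urlparse

# Country code top-level domains for the three countries named in the abstract.
CCTLD_TO_SUBCORPUS = {"at": "Austria", "de": "Germany", "ch": "Switzerland"}


def subcorpus_for(url):
    """Return the subcorpus label for a URL, or None if its TLD is not covered."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return CCTLD_TO_SUBCORPUS.get(tld)


def regional_shares(pages, phrase):
    """Share of pages per subcorpus whose text contains `phrase`.

    `pages` is an iterable of (url, text) pairs (hypothetical input format).
    Matching is a case-insensitive substring test, a deliberate simplification.
    """
    totals = Counter()
    hits = Counter()
    for url, text in pages:
        region = subcorpus_for(url)
        if region is None:
            continue  # page lies outside the three ccTLDs
        totals[region] += 1
        if phrase.lower() in text.lower():
            hits[region] += 1
    # Only subcorpora that received at least one page appear in the result.
    return {region: hits[region] / totals[region] for region in totals}
```

Comparing shares rather than raw counts keeps subcorpora of different sizes comparable; a share that is clearly higher in one subcorpus than in the other two would mark the phrase as a candidate regional variant for manual review.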
Faculties and Departments: 04 Faculty of Humanities and Social Sciences > Departement Sprach- und Literaturwissenschaften > Ehemalige Einheiten Sprach- und Literaturwissenschaften > Deutsche Sprachwissenschaft (Häcki Buhofer)
UniBasel Contributors: Roth, Tobias
Item Type: Conference or Workshop Item, refereed
Conference or workshop item Subtype: Conference Paper
Publisher: ACL SIGWAC
Note: Publication type according to Uni Basel Research Database: Conference paper
Last Modified: 16 Mar 2017 07:21
Deposited On: 09 Mar 2017 13:48