
Korpus Polsko-Niemiecki Uniwersytetu Warszawskiego i Uniwersytetu Gutenberga (PolGerCorp)
Polish-German Corpus of the University of Warsaw and Mainz Gutenberg University (PolGerCorp)

Author(s): Marek Łaziński, Andreas Meger, Michał Woźniak
Subject(s): Language and Literature Studies
Published by: Instytut Germanistyki Uniwersytetu Warszawskiego
Keywords: parallel corpora; Polish; German; corpus representativeness; verbal aspect

Summary/Abstract: The paper presents the search possibilities offered by a new Polish-German corpus. The Polish-German / German-Polish Parallel Corpus (PolGerCorp) was developed in 2018–2021 under the auspices of the Universities of Mainz and Warsaw as part of the research project “The Development of the Polish Aspect System in the Last 250 Years against the Background of Neighbouring Languages” (Beethoven II DFG/NCN). It comprises about 10 million tokens in texts from 1750 to 2020, translated in both directions and drawn from various genres (fictional prose, non-fictional texts, press, legal texts). The texts are tagged, lemmatized and automatically sentence-aligned. The article describes the structure of the corpus and the practical work on it, and focuses on a new interface “for all”: it includes a graphical query builder and also allows the user to enter sophisticated CQP queries directly.

  • Issue Year: 2022
  • Issue No: 16
  • Page Range: 379-390
  • Page Count: 12
  • Language: Polish