Design and Applications of MoncoPL as a Monitor Corpus of Polish

Budowa i zastosowania korpusu monitorującego MoncoPL
Design and Applications of MoncoPL as a Monitor Corpus of Polish

Author(s): Piotr Pęzik
Subject(s): Language studies, Language and Literature Studies, Theoretical Linguistics, Applied Linguistics, Western Slavic Languages, Philology
Published by: Wydawnictwo Uniwersytetu Śląskiego
Keywords: MoncoPL; monitor corpus; Polish; diachronic corpora

Summary/Abstract: This paper introduces the methodology of compiling and maintaining MoncoPL, a large monitor corpus of web-based Polish. It also gives an overview of the search engine of the same name to show how the size and composition of the corpus, which currently exceeds 5.6 billion word tokens, facilitate research on the distributional properties of rare words, neologisms and phraseological units. Finally, the article illustrates some advantages of using a densely sampled diachronic corpus to observe frequency trends and cycles of various constructions in online media discourse.

  • Issue Year: 2020
  • Issue No: 7
  • Page Range: 133-150
  • Page Count: 18
  • Language: Polish