Srovnání mluvených korpusů: skutečně jen odlišná hlediska?
Comparison of spoken corpora: really just a matter of perspective?

Author(s): Michal Křen, Martina Waclawičová
Subject(s): Applied Linguistics
Published by: AV ČR - Akademie věd České republiky - Ústav pro jazyk český
Keywords: corpus design; spoken Czech; metadata; regional coverage; representativeness

Summary/Abstract: Recently, more attention has been paid to the issues of corpus design and representativeness. These issues are especially important for general-purpose language corpora such as the spoken corpora developed within the framework of the Czech National Corpus. This text is a response to Jan Chromý’s paper “Comparison of spoken corpora from a sociolinguistic perspective” (Slovo a slovesnost 78, 2017: 145‒158), in which the author compares the general-purpose spoken corpus ORAL2013 with his own dataset collected for the SAUP project. We argue that some of his claims are not justified by the findings presented in the paper and that his understanding of the concept of representativeness is rather misleading. Therefore, we aim to clarify some fundamental design decisions adopted for the compilation of ORAL2013 by responding to the specific objections raised by Chromý. We also point out some methodological and reasoning inconsistencies in his paper.

  • Issue Year: 80/2019
  • Issue No: 2
  • Page Range: 128-139
  • Page Count: 12
  • Language: Czech