Korpus a reprezentativnost
Corpus and representativeness

Author(s): Jan Chromý
Subject(s): Language and Literature Studies
Published by: AV ČR - Akademie věd České republiky - Ústav pro jazyk český
Keywords: corpus; representativeness; specialized corpora; population; inferential statistics

Summary/Abstract: This paper discusses the concept of representativeness in corpus linguistics. Representativeness is a concept from empirical, quantitative science that characterizes the relationship between a sample and the population. The paper argues that the population cannot be defined for the standard, supposedly “representative” corpora of a whole language. A population could be reliably defined only for specialized corpora (e.g. corpora of newspaper texts), hence only corpora of this type could be truly statistically representative. The paper also discusses the idea that representativeness could be considered from the perspective of particular linguistic items rather than from the perspective of the whole language: the same corpus may be representative of the use of one item and, at the same time, not representative of the use of another.

  • Issue Year: 2014
  • Issue No: 4-5
  • Page Range: 185-193
  • Page Count: 9
  • Language: Czech