Do Frequency Types Matter in Lexicography?

Author(s): Marek Blahuš, Vojtěch Kovář, František Kovařík
Subject(s): Language studies, Electronic information storage and retrieval, Theoretical Linguistics, Applied Linguistics, Lexis, Computational linguistics, Western Slavic Languages
Published by: SAV - Slovenská akadémia vied - Jazykovedný ústav Ľudovíta Štúra Slovenskej akadémie vied
Keywords: corpus annotation; semi-automatic dictionary drafting; Dictionary Express; word frequency; frequency type; absolute frequency; document frequency; ALDF; ARF; Czech

Summary/Abstract: Word frequency in a corpus can be calculated in several ways. Among the most common frequency types are absolute frequency, document frequency, ALDF and ARF. This paper compares these four types in terms of “word correctness.” To determine whether a word is correct, we use the data gathered for the Czech lexicon of the recent Czech Dictionary Express project. In this project, each of the 100,000 most frequent headwords was reviewed by several native speakers of Czech, who decided whether the word should be accepted, rejected, or marked as having minor issues. The quality of the “word correctness” is further discussed in the paper.
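
The four frequency types named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper. It assumes the commonly cited definitions of ARF (average reduced frequency) and ALDF (average logarithmic distance frequency), both computed from the cyclic distances between consecutive occurrences of a word in a corpus of N tokens. The function names, the dictionary keys and the toy corpora are hypothetical.

```python
import math


def cyclic_gaps(positions, corpus_size):
    """Distances between consecutive occurrences, wrapping past the corpus end."""
    positions = sorted(positions)
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    # Close the cycle: distance from the last occurrence back to the first.
    gaps.append(positions[0] + corpus_size - positions[-1])
    return gaps


def frequency_types(documents, word):
    """Return the four frequency types for `word`.

    `documents` is a list of token lists. The ARF and ALDF formulas are
    the assumed textbook definitions, not the paper's own code.
    """
    tokens = [tok for doc in documents for tok in doc]
    n = len(tokens)
    positions = [i for i, tok in enumerate(tokens) if tok == word]
    f = len(positions)
    doc_freq = sum(1 for doc in documents if word in doc)
    if f == 0:
        return {"absolute": 0, "document": 0, "arf": 0.0, "aldf": 0.0}

    gaps = cyclic_gaps(positions, n)
    v = n / f  # average distance between occurrences
    # ARF: each gap contributes at most one "segment" of length v.
    arf = sum(min(d, v) for d in gaps) / v
    # ALDF: exponential of the entropy of the normalized gaps.
    aldf = math.exp(-sum((d / n) * math.log(d / n) for d in gaps))
    return {"absolute": f, "document": doc_freq, "arf": arf, "aldf": aldf}


# Same absolute frequency, different dispersion.
even = [["a", "b", "a", "b"], ["a", "b", "a", "b"]]
bursty = [["a", "a", "a", "a"], ["b", "b", "b", "b"]]
print(frequency_types(even, "a"))
print(frequency_types(bursty, "a"))
```

In both toy corpora the word occurs four times, but in the second one all occurrences sit in a single document. Under these assumed definitions, ARF and ALDF equal the absolute frequency for evenly spread occurrences and drop below it for clustered ones, while document frequency only counts the documents that contain the word.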

  • Issue Year: 76/2025
  • Issue No: 1
  • Page Range: 303-311
  • Page Count: 9
  • Language: English