Statistika koht keelemudelis
Place of statistics in a language model

Author(s): Heiki-Jaan Kaalep
Subject(s): Morphology, Computational linguistics, Methodology and research technology
Published by: SA Kultuurileht
Keywords: morphology; corpus linguistics; linguistic variation; text statistics

Summary/Abstract: The article speculates on how quantitative data may fit into a theoretical model of language. It argues that a language model should include an idea of the generation procedure at play, even if only a speculative one. One concrete example shows how quantitative data form an integral part of a model of Estonian morphology; another shows how corpus-based statistical models may lead to dubious statistical calculations. Two descriptions of earlier experiments in statistical learning point to a path worth following in corpus linguistics: more attention should be paid to some not-so-obvious features that play a role in human language learning, namely transitional probabilities and the linguistic units that should be left out of computations.

  • Issue Year: LXI/2018
  • Issue No: 08-09
  • Page Range: 713-727
  • Page Count: 15
  • Language: Estonian