Suulise eesti keele korpus ja inimese suhtlus arvutiga
Corpus of spoken Estonian and human-computer interaction

Author(s): Mare Koit, Riina Kasterpalu, Andriela Rääbis, Olga Gerassimenko, Tiit Hennoste, Krista Strandson
Subject(s): Language and Literature Studies
Published by: Eesti Rakenduslingvistika Ühing (ERÜ)
Keywords: corpus of spoken language; dialogue corpus; transcription; dialogue acts; annotation; spoken interaction; Estonian

Summary/Abstract: We argue that creating user interfaces to databases requires studying human-human spoken conversations of various kinds. An efficient human-computer dialogue system benefits from a well-organized corpus that can be used to investigate the strategies people use in conversation to be efficient and to handle the problems of spoken communication. For modelling natural behaviour and for testing the model, we need a dialogue corpus in which the roles of the participants are close to those of a dialogue system and its user. A corpus of a single institutional conversation type is insufficient for creating a user interface, since we need to know which phenomena are inherent to spoken language in general, which means are used only in certain types of conversation, and what the differences are. For that reason, we collect and investigate the Corpus of Spoken Estonian and the Estonian Dialogue Corpus (a subcorpus of the former) as sources for investigating human-human interaction. The transcription conventions and annotation typology for spoken human-human dialogues in Estonian are introduced. An application of the Estonian Dialogue Corpus to investigating the formal and functional characteristics of requests in information dialogues is presented.

  • Issue Year: 2009
  • Issue No: 5
  • Page Range: 111-130
  • Page Count: 20
  • Language: Estonian