Intonuoto garsyno kūrimo principai
Principles of Development of the Intonational Annotated Spoken Corpus

Author(s): Asta Kazlauskienė, Gailius Raškinis
Subject(s): Language and Literature Studies
Published by: Vytauto Didžiojo Universitetas
Keywords: speech corpus; intonation; spoken language; annotation

Summary/Abstract: Language manifests itself in both spoken and written forms. The two forms differ in many linguistic respects and in the methods and tools by which they are acquired and analyzed. Intonation is one of the most important phenomena of spoken language. It comprises the segmentation of speech into meaningful units, emphasis of key words, fluctuation in speech tempo, and the expression of emotions. Intonation is poorly represented by orthography, which poses many problems for intonation researchers. A specially prepared intonational speech corpus is a prerequisite for any serious intonation research. The process of compiling such a corpus can be divided into a few steps: a) acquisition and recording of prosody-rich utterances of spoken language; b) description of the content of these utterances and their markup with tags that describe prosodic features; c) automatic assignment of timings to prosodic features (as a result of phone-level annotation of the corpus). Every step in this process requires certain procedures to be observed and certain requirements to be met. Linguists around the world have built more than one intonational corpus, and some common methodologies have been developed, which allows intonational features of different languages to be compared. This paper describes and discusses the process of building an intonationally annotated speech corpus of Lithuanian: tagging and labeling linguistic and extra-linguistic phenomena, cliticization, markup of phrase and sentence boundaries, determining and labeling logical stress, and markup of fundamental frequency.

  • Issue Year: 15/2013
  • Issue No: 1
  • Page Range: 101-110
  • Page Count: 10
  • Language: Lithuanian