Zachycení výstavby textu v Pražském závislostním korpusu
Annotation of discourse phenomena in the Prague Dependency Treebank

Author(s): Lucie Poláková, Jiří Mírovský, Pavlína Jínová, Šárka Zikánová, Eva Hajičová, Magdaléna Rysová, Anna Nedoluzhko
Subject(s): Language and Literature Studies
Published by: Ústav pro jazyk český Akademie věd České republiky
Keywords: text; discourse; phenomena beyond the sentence boundary; discourse relations; discourse connectives; coreference; bridging anaphora; Prague Dependency Treebank

Summary/Abstract: Annotation schemes for language corpora now cover various layers of sentence description, from morphology to semantics. Annotation projects concerning phenomena beyond the sentence boundary, however, have only recently begun to attract the attention of corpus linguists. In the present contribution, we describe a unified approach to the analysis of discourse phenomena, developed for large-scale annotation of Czech empirical data in the Prague Dependency Treebank. This approach rests on two fundamental pillars: (i) it exploits the results of one of the first complex schemes for discourse annotation, proposed and realized in the Penn Discourse Treebank for English; (ii) it follows the Praguian Functional Generative Description and treebanking tradition, taking advantage of the tectogrammatical (underlying) layer of sentence analysis and extending it to a full discourse-level description. Our analysis concentrates on two major aspects of discourse coherence: (i) discourse relations (semantic relations between discourse segments) and discourse connectives as their lexical anchors; and (ii) coreference and so-called bridging anaphora. We present a detailed description of the annotation scheme and procedure, address individual problematic issues, and offer basic corpus statistics and an evaluation of the annotation.

  • Issue Year: 76/2015
  • Issue No: 3
  • Page Range: 163-197
  • Page Count: 35