Latviešu valodas FrameNet korpuss
Latvian FrameNet

Author(s): Baiba Saulite, Gunta Nešpore-Bērzkalne, Laura Rituma, Viesturs Jūlijs Lasmanis, Normunds Grūzītis
Subject(s): Information Architecture, Lexis, Semantics, Baltic Languages
Published by: Latvijas Universitātes Literatūras, folkloras un mākslas institūts
Keywords: semantic annotation; frame semantics; semantic role; lexical unit; verb; deverbal derivative

Summary/Abstract: This paper presents a FrameNet-annotated text corpus for the Latvian language. Latvian FrameNet is part of the FullStack corpus, a medium-sized, general-purpose, multi-layered corpus anchored in cross-lingual state-of-the-art syntactic and semantic representations: Universal Dependencies (UD), FrameNet, PropBank, and Abstract Meaning Representation (AMR). The FullStack corpus has been designed to be varied and balanced in terms of genres, domains, and lexical units. To annotate the FrameNet layer, we use the latest frame inventory of Berkeley FrameNet, and the annotation itself is done on top of the underlying UD layer. Thus, the annotation of frames and frame elements is guided by the dependency structure of a sentence rather than its phrase structure. We strictly follow a corpus-driven approach: lexical units (verbs and deverbal derivatives) in Latvian FrameNet are created only on the basis of annotated corpus examples. Currently, 570 Berkeley FrameNet frames have been used for semantic annotation of the Latvian FrameNet corpus, and 2,900 lexical units (on average 5.1 lexical units per frame) and almost 26,000 usage examples (on average 8.9 per lexical unit) have been tagged. To make this data available for linguistic research, a website for the FrameNet-LV corpus has been created.

  • Issue Year: 2022
  • Issue No: 47
  • Page Range: 284-296
  • Page Count: 13
  • Language: Latvian