Extraction and Presentation of Bilingual Correspondences from Slovak-Bulgarian Parallel Corpus

Author(s): Radovan Garabík, Ludmila Dimitrova
Subject(s): Language and Literature Studies, Theoretical Linguistics, Comparative Linguistics
Published by: Instytut Slawistyki Polskiej Akademii Nauk
Keywords: translation equivalents; GIZA++; parallel corpora; aligned text; Slovak; Bulgarian

Summary/Abstract: This paper describes the results of automatically extracting and presenting bilingual correspondences from the Slovak-Bulgarian Parallel Corpus. Equivalent phrases are extracted from a corpus automatically aligned at the sentence and word levels, then filtered, indexed, and presented in a dictionary-like interface. The bilingual dictionary database contains 80 thousand phrase pairs comprising approximately 350 thousand words in each language. Counted as unique word forms, the Slovak part of the dictionary contains 31 thousand and the Bulgarian part 26 thousand.
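
The extract, filter, and index steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the textbook rule of keeping phrase pairs that are consistent with the word alignment, and it assumes the aligner output has already been converted to (source index, target index) pairs. The function names, the max_len and min_count parameters, and the two-word example sentence are hypothetical and are not taken from the corpus.

```python
from collections import Counter, defaultdict


def extract_phrase_pairs(src, tgt, alignment, max_len=4):
    """Return phrase pairs consistent with a word alignment.

    src, tgt: lists of tokens; alignment: iterable of (src_idx, tgt_idx).
    Assumption: textbook consistency rule, not the paper's exact procedure.
    """
    alignment = list(alignment)
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to any source word in [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            # Reject the span if a target word inside it links outside [i1, i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for (i, j) in alignment):
                continue
            pairs.append((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


def build_index(pair_counts, min_count=1):
    """Filter pairs by frequency and index them by source phrase."""
    index = defaultdict(list)
    for (s, t), n in pair_counts.items():
        if n >= min_count:  # hypothetical frequency filter
            index[s].append((t, n))
    return dict(index)


# Made-up example sentence pair (Slovak / Bulgarian), aligned word by word.
src = ["dobrý", "deň"]
tgt = ["добър", "ден"]
alignment = [(0, 0), (1, 1)]

pairs = extract_phrase_pairs(src, tgt, alignment)
index = build_index(Counter(pairs))
print(index)
```

Over a whole corpus, the Counter would accumulate pairs from every aligned sentence, and a higher min_count would discard rare, likely noisy correspondences before the index is used for dictionary-style lookup.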

  • Issue Year: 2015
  • Issue No: 15
  • Page Range: 327-334
  • Page Count: 8
  • Language: English