Структура хакасской словоформы и ограничения на сочетаемость аффиксов в автоматическом парсере хакасского языка
The Structure of the Khakas Word Form and Restrictions on the Compatibility of Affixes in the Automatic Parser for the Khakas Language

Author(s): Anna Vladimirovna Dybo, Vera S. Maltseva, Elvira V. Sultrekova, Aleksandra V. Sheimovich, Philip S. Krylov
Subject(s): Morphology, Syntax, Turkic languages
Published by: Institute of Linguistics of the Russian Academy of Sciences (Институт языкознания Российской академии наук)
Keywords: Turkic languages; Khakas language; corpus linguistics; morphology; grammar; automatic language processing

Summary/Abstract: This article is a continuation of [Dybo et al. 2019]. It describes the morphological rules that operate in the automatic parser of the Khakas language (https://khakas.altaica.ru/grammar/). When applied to large volumes of actual linguistic material, any automatic model of morphological analysis that does not include restrictions on the compatibility of morphemes will produce a large number of incorrect analyses, while the Turkic languages already show an extremely high number of correct homonymous analyses. The article presents not the analysis algorithm used in the parser code, but the list of rules, written in natural language, that underlies the operation of the algorithm. In the explanation of each rule, we give examples illustrating it and explain why such a restriction on the behavior of affixes is needed. The restrictions differ in status. Some are due to the fundamental structure of the Turkic word form or to the incompatibility of the semantic characteristics of morphemes. Others follow from the fact that the automaton is aimed at a practical application, namely the analysis of texts included in the Khakas Language Corpus. The article begins with a scheme of the Khakas word form and a brief description of the inflectional affixes of the Khakas language. Some Khakas dialect morphemes found in the course of field work are described for the first time. Some diachronic processes in Khakas are also discussed.

  • Issue Year: 2023
  • Issue No: 02 (49)
  • Page Range: 42-75
  • Page Count: 34
  • Language: Russian