BelarusianGLUE: Analyzing Performance of Open-weight Models
Author(s): Maksim Aparovich, Volha Harytskaya, Vladislav Poritski, Oksana Volchek, Pavel Smrz
Subject(s): Language studies, Language and Literature Studies, Applied Linguistics, Computational linguistics, Eastern Slavic Languages, Philology
Published by: Институт за литература - БАН
Keywords: natural language processing; Belarusian language; large language models; language understanding evaluation
Summary/Abstract: We use BelarusianGLUE, a recently introduced benchmark, to analyze the performance of open-weight large language models (LLMs) on Belarusian language understanding tasks. We investigate the impact of prompting language, few-shot prompts, orthography (modern/classical/Latin), chat templates, and evaluation mode (discriminative/generative). Our findings suggest that more recent models generally perform better, but improvements are gradual. Fine-tuning on related Slavic languages does not always improve Belarusian understanding. Classical orthography has limited impact, while latinization degrades performance. Analysis of specific tasks (sentiment analysis, Winograd schema challenge) reveals biases in the models, difficulties with understanding linguistic structure, and gaps in world knowledge and cultural context.
Journal: Scripta & e-Scripta
- Issue Year: 2025
- Issue No: 25
- Page Range: 25-38
- Page Count: 14
- Language: English
