Модел за автоматично сричкообразуване на български думи основан на ненадзиравано машинно самообучение
An Unsupervised Machine Learning Model for Automatic Syllabification of Bulgarian Words

Author(s): Krasen Penchev
Subject(s): Applied Linguistics, ICT Information and Communications Technologies
Published by: Съюз на учените - Варна (Union of Scientists - Varna)
Keywords: syllabification; machine learning; automatic; unsupervised; model

Summary/Abstract: There are many definitions of the syllable and much discussion about its role in the structure of spoken languages. Some linguists give it a central place in their theories. Given that every native speaker of a language can divide its words into syllables, it can be concluded that the syllable is a structural unit of spoken language. Automatic syllabification is, at least in theory, applicable to a broad range of problems. Unfortunately, it is not as widely used as one would expect. The main reasons for its low adoption are the small number and low quality of the available training resources. This report presents a model for unsupervised automatic syllabification. The aim is to design a general-purpose model that addresses these problems of automatic syllabification in the context of the Bulgarian language. The presented method is not constrained by the volume of the training data or by the field of knowledge the data comes from.

  • Issue Year: 7/2018
  • Issue No: 3
  • Page Range: 133-139
  • Page Count: 7
  • Language: Bulgarian