Identifying Polarity in Different Text Types

Author(s): Hille Pajupuu, Rene Altrov, Jaan Pajupuu
Subject(s): Customs / Folklore, Cultural Anthropology / Ethnology, Culture and social structure
Published by: Eesti Kirjandusmuuseum
Keywords: lexicon-based approach; machine learning approach; Naïve Bayes; polarity; sentiment analysis; SVM; text types

Summary/Abstract: While sentiment analysis aims to identify the writer’s attitude toward individuals, events or topics, our aim is to predict the possible effect of a written text on the reader. For this purpose, we created an automatic identifier of the polarity of Estonian texts that is independent of domain and text type. Depending on the approach chosen (lexicon-based or machine learning), the identifier uses either a lexicon of words with a positive or negative connotation, or a text corpus in which orthographic paragraphs have been annotated as positive, negative, neutral or mixed. Both approaches worked well, reaching an average accuracy of nearly 75%. In some cases the results depended on the text type: for sports texts the lexicon-based approach yielded a maximum accuracy of 80.3%, while machine learning achieved over 88% for opinion stories.
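
As a rough illustration of the lexicon-based idea described in the abstract, the following minimal sketch labels a paragraph with one of the four classes named there (positive, negative, neutral, mixed). It is not the authors' implementation: the tiny English word lists, the function name `classify_paragraph`, and the decision rule (any co-occurrence of positive and negative words counts as mixed) are all assumptions made for illustration, and the article's Estonian lexicon is not reproduced here.

```python
import re

# Hypothetical toy lexicon for illustration only; the article's actual
# Estonian lexicon of positive and negative words is not shown here.
POSITIVE = {"win", "great", "happy", "success"}
NEGATIVE = {"loss", "bad", "sad", "injury"}


def classify_paragraph(text: str) -> str:
    """Label a paragraph as positive, negative, neutral, or mixed.

    Assumed rule (not taken from the article): count lexicon hits and
    treat any co-occurrence of positive and negative words as "mixed".
    """
    tokens = re.findall(r"\w+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos == 0 and neg == 0:
        return "neutral"
    if pos > 0 and neg > 0:
        return "mixed"
    return "positive" if pos > 0 else "negative"
```

A real system for Estonian would most likely also need lemmatization before the lexicon lookup, since Estonian is highly inflected; that step is omitted from this sketch.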

  • Issue Year: 2016
  • Issue No: 64
  • Page Range: 125-142
  • Page Count: 18
  • Language: English