A Modification of the Leacock-Chodorow Measure of the Semantic Relatedness of Concepts

Author(s): Jerzy Korzeniewski
Subject(s): Language studies, Methodology and research technology
Published by: Wydawnictwo Uniwersytetu Łódzkiego
Keywords: text mining; WordNet network; semantic relatedness; Leacock-Chodorow measure

Summary/Abstract: Measures of the semantic relatedness of concepts fall into two types: knowledge‑based methods and corpus‑based methods. Knowledge‑based techniques use man‑made dictionaries, thesauruses and other artefacts as a source of knowledge. Corpus‑based techniques assess the semantic similarity of two concepts using large corpora of text documents. Some researchers claim that knowledge‑based measures outperform corpus‑based ones, but the more important observation is that the latter depend heavily on the corpus used. In this article, we propose a modification of the best WordNet‑based method of assessing semantic relatedness, i.e. the Leacock‑Chodorow measure. This measure has proven to be the best in several studies and has a very simple formula. We assess our proposal on two popular benchmark sets of concept pairs, i.e. the Rubenstein‑Goodenough set of 65 pairs of concepts and the Finkelstein set of 353 pairs of terms. The results show that our proposal outperforms the traditional Leacock‑Chodorow measure.
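
For orientation, the traditional (unmodified) Leacock‑Chodorow measure is usually written as sim(c1, c2) = -log(len(c1, c2) / (2 D)), where len is the shortest is-a path between the two concepts counted in nodes and D is the maximum depth of the taxonomy. The sketch below illustrates only this standard formula; the modification proposed in the article is not described in the abstract and is not reproduced here. The toy taxonomy `TOY`, its depth `TOY_MAX_DEPTH`, and the function names are hypothetical and serve only as an illustration, not as WordNet data.

```python
import math
from collections import deque

# Hypothetical toy taxonomy (undirected is-a links), NOT real WordNet data.
TOY = {
    "entity": ["animal", "vehicle"],
    "animal": ["entity", "dog", "cat"],
    "vehicle": ["entity", "car"],
    "dog": ["animal"],
    "cat": ["animal"],
    "car": ["vehicle"],
}
# Assumed maximum depth, counted in nodes on the longest root-to-leaf chain
# (entity -> animal -> dog).
TOY_MAX_DEPTH = 3


def shortest_path_edges(graph, start, goal):
    """Breadth-first search; returns the number of edges on the shortest
    path between start and goal, or None if they are not connected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def lch_similarity(path_edges, max_depth):
    """Standard Leacock-Chodorow score. The path length is counted in
    nodes (edges + 1), so identical concepts do not yield log(0)."""
    return -math.log((path_edges + 1) / (2.0 * max_depth))


def toy_lch(c1, c2):
    """Convenience wrapper over the toy taxonomy."""
    edges = shortest_path_edges(TOY, c1, c2)
    if edges is None:
        return None
    return lch_similarity(edges, TOY_MAX_DEPTH)
```

In this toy setting, `toy_lch("dog", "cat")` scores higher than `toy_lch("dog", "car")`, because the first pair is joined by a shorter is-a path. Benchmark evaluations of such measures are commonly reported as the correlation between the computed scores and human relatedness judgements; the abstract does not state which statistic the article uses.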

  • Issue Year: 6/2020
  • Issue No: 351
  • Page Range: 97-106
  • Page Count: 10
  • Language: English