Persistent semantic identity in WordNet

Author(s): Eric Kafe
Subject(s): Lexis, Semantics, Transformation Period (1990 - 2010), Present Times (2010 - today)
Published by: Instytut Slawistyki Polskiej Akademii Nauk
Keywords: wordnets; semantic identifiers; sense keys; key violations; synsets; mappings

Summary/Abstract: Although rarely studied, the persistence of semantic identity in the WordNet lexical database is crucial for the interoperability of all the resources that use WordNet data. The present study investigates the stability of the two primary entities of the WordNet database (the word senses and the synonym sets), by following their respective identifiers (the sense keys and the synset offsets) across all the versions released between 1995 and 2012, while also considering drifts of identical definitions and semantic relations. Contrary to expectations, 94.4% of the WordNet 1.5 synsets still persisted in the latest 2012 version, compared to only 89.1% of the corresponding sense keys. Meanwhile, the splits and merges between synonym sets remained few and simple. These results are presented in tables that make it possible to estimate the lexicographic effort needed for updating WordNet-based resources to newer WordNet versions. We discuss the specific challenges faced by both the dominant synset-based mapping paradigm (a moderate number of split synsets) and the recommended sense key-based approach (very few identity violations), and conclude that stable synset identifiers are viable, but need to be complemented by stable sense keys in order to adequately handle the split synonym sets.
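
The persistence percentages quoted in the abstract can be read as the share of identifiers from an older version that still exist in a newer one. The following is a minimal sketch of that idea for sense keys only, not the study's actual procedure: the function name `persistence_rate` and the two tiny key sets are hypothetical and made up for illustration, and they are not real WordNet data. Synset offsets are left out here on the assumption that they cannot be compared by plain set membership and would need a mapping between versions.

```python
def persistence_rate(old_keys, new_keys):
    """Return the fraction of identifiers in old_keys that also occur in new_keys."""
    old = set(old_keys)
    if not old:
        raise ValueError("old_keys must not be empty")
    return len(old & set(new_keys)) / len(old)


# Hypothetical toy data shaped like sense keys; NOT taken from any WordNet release.
old_version = {"dog%1:05:00::", "cat%1:05:00::", "run%2:38:00::", "bank%1:14:00::"}
new_version = {"dog%1:05:00::", "cat%1:05:00::", "run%2:38:00::", "bank%1:17:01::"}

rate = persistence_rate(old_version, new_version)
print(f"{rate:.1%}")  # prints "75.0%"
```

In this toy case three of the four old keys survive, so a resource indexed by the old keys would need one entry revisited by hand.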

  • Issue Year: 2018
  • Issue No: 18
  • Page Range: 1-20
  • Page Count: 20
  • Language: English