Inside Baseball: Coverage, quality, and culture in the Global WordNet

Author(s): Martin Benjamin
Subject(s): Language studies, Media studies, Lexis, ICT Information and Communications Technologies
Published by: Instytut Slawistyki Polskiej Akademii Nauk
Keywords: wordnet; lexicography; vocabulary; named entities; multilingual

Summary/Abstract: The Global WordNet is succeeding in producing relatively open linguistic data that is coordinated to a degree among numerous languages. The project has grown organically, with no overall plan or direction. The result is a certain amount of incoherence in determining what items should be treated in wordnets, and how the various wordnets should aspire to consistent quality. Using the example of terms related to baseball, which constitute a non-trivial portion of the Princeton WordNet, this paper discusses problems of coverage selection both for English and for other languages, as well as methods to improve quality and depth through public review of current content, and contribution of missing terms and definitions. It is proposed that proper names be removed entirely from WordNet and treated as a separate project, and that individual languages produce annexes of indigenous concepts that can be readily considered within sister projects as a supplement to the Anglo–American weighting of the current endeavor. To produce a consistent product that transmits inter-intelligible understanding at a high level across languages, it is proposed that an open committee of interested stakeholders convene to consider the project’s goals and develop a roadmap for how to achieve them.

  • Issue Year: 2018
  • Issue No: 18
  • Page Range: 1-14
  • Page Count: 14
  • Language: English