Style-Markers in Authorship Attribution: A Cross-Language Study of the Authorial Fingerprint

Author(s): Maciej Eder
Subject(s): Language and Literature Studies
Published by: Wydawnictwo Uniwersytetu Jagiellońskiego
Keywords: Kurpian phonology; vowels; Optimality Theory; Polish dialects; Polish phonology

Summary/Abstract: The present study addresses one of the theoretical problems of computer-assisted authorship attribution, namely the question of which traceable features of language can betray the authorial uniqueness (a stylistic fingerprint) of literary texts. A number of recent approaches show that, apart from lexical measures (especially those relying on the frequencies of the most frequent words), some other features of written language are also considerably effective as discriminators of authorial style. However, there have been no attempts to compare the attribution potential of these features. The aim of the present study, then, was to examine the effectiveness of several style-markers in authorship attribution. The style-markers chosen for the empirical investigation are those that can be retrieved from a non-lemmatized corpus of plain text files, such as the most frequent words, word bi-grams, different letter sequences, and markers of a different nature combined in one sample. Equally important, however, was to compare the usefulness of the chosen style-markers across a few languages: English, Polish, German, and Latin. The results confirmed the high attribution effectiveness of word-based style-markers in the English corpus, but the alternative markers proved to be usually more effective in the other languages.
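
As a rough illustration of the kinds of style-markers named in the abstract (most frequent words, word bi-grams, and letter sequences taken from non-lemmatized plain text), the following minimal Python sketch shows how such features might be counted. It is not the author's procedure: the function names, the tokenization rule, and the choice of letter tri-grams are all assumptions made for this example only.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenization: lowercase, keep runs of Unicode letters only.
    # No lemmatization is applied, in line with the plain-text setting.
    return re.findall(r"[^\W\d_]+", text.lower())


def word_frequencies(tokens):
    # Relative frequency of each word form in one text sample.
    if not tokens:
        return {}
    total = len(tokens)
    return {word: count / total for word, count in Counter(tokens).items()}


def word_bigrams(tokens):
    # Counts of adjacent word pairs.
    return Counter(zip(tokens, tokens[1:]))


def letter_ngrams(text, n=3):
    # Counts of character sequences of length n over the space-joined tokens
    # (n=3 is an arbitrary choice for this sketch).
    joined = " ".join(tokenize(text))
    return Counter(joined[i:i + n] for i in range(len(joined) - n + 1))


def most_frequent(counter, k):
    # The k most frequent items, e.g. the candidate style-markers.
    return [item for item, _ in counter.most_common(k)]
```

A sample would typically be reduced to a vector of frequencies for the top-ranked items of one or more marker types before any comparison between texts; which distance measure or classifier to use on those vectors is left open here.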

  • Issue Year: 6/2011
  • Issue No: 1
  • Page Range: 99-114
  • Page Count: 16
  • Language: English