The subjective frequency of word n-grams

Author(s): Cyrus Shaoul, Chris F. Westbury, R. Harald Baayen
Subject(s): Language studies, Lexis, Psycholinguistics, Cognitive Psychology, Experimental Psychology
Published by: Društvo psihologa Srbije
Keywords: subjective frequency; n-grams; relative frequency

Summary/Abstract: When asked to think about the subjective frequency of an n-gram (a group of n words), what properties of the n-gram influence the respondent? It has recently been shown that n-grams that occurred more frequently in a large corpus of English were read faster than n-grams that occurred less frequently (Arnon & Snider, 2010), an effect analogous to the frequency effects in word reading and lexical decision. The subjective frequency of words has also been extensively studied and linked to performance on linguistic tasks. We investigated the capacity of people to gauge the absolute and relative frequencies of n-grams. Subjective frequency ratings collected for 352 n-grams showed a strong correlation with corpus frequency, in particular for n-grams with the highest subjective frequency. These n-grams were then paired up and used in a relative frequency decision task (e.g., "Is 'green hills' more frequent than 'weekend trips'?"). Accuracy on this task was reliably above chance, and trial-level accuracy was best predicted by a model that included the corpus frequencies of the whole n-grams. A computational model of word recognition (Baayen, Milin, Djurdjevic, Hendrix, & Marelli, 2011) was then used to attempt to simulate subjective frequency ratings, with limited success. Our results suggest that human n-gram frequency intuitions arise from the probabilistic information contained in n-grams.

  • Issue Year: 46/2013
  • Issue No: 4
  • Page Range: 497-537
  • Page Count: 41
  • Language: English