Multiword expression tagging of Spanish native and non-native speakers' written essays in a grammar and composition developmental course

Etiquetaje de expresiones multipalabra en ensayos escritos por nativos y no nativos de español en un curso de desarrollo de gramática y composición

Author(s): Miguel Da Corte, Jorge Baptista
Subject(s): Language studies, Language and Literature Studies, Foreign languages learning, Applied Linguistics
Published by: Univerzita Palackého v Olomouci
Keywords: multiword expressions; language proficiency; classification level; machine learning models; developmental education courses (in Spanish)

Summary/Abstract: The literature on second language learning posits that native speakers (NS) and non-native speakers (NNS) differ significantly in their use of multiword expressions (MWE), and that levels of language proficiency can be estimated from the use of these expressions. This paper analyses a corpus of essays written by native (16 essays, 5839 words) and non-native (25 essays, 7767 words) Spanish speakers enrolled in a course focused on the development of orthographic, grammatical, lexical, semantic, and discursive skills in Spanish. This is a required course for students pursuing a certification in Translating or Interpreting (Spanish/English) in the educational setting where the study took place. The corpus was manually tagged by two linguists, using a classification scheme inspired by schemes built for similar purposes in the literature. The results show that, in general, the distribution of MWE types in the NS and NNS partitions of the corpus did not differ greatly (Pearson correlation: 0.894). However, interesting differences were found between the categories of verbal idioms and noun constructions. Though the corpus is too small for firmer conclusions to be drawn, the results indicate that different types of MWE are unevenly distributed across the written production of native speakers and non-native learners, and that some categories may be clearer indicators of near-native-speaker proficiency.

  • Issue Year: 35/2023
  • Issue No: 1
  • Page Range: 23-40
  • Page Count: 18
  • Language: Spanish