Building WITS: Challenges in creating a corpus of workplace sitcoms
Author(s): Karla Csürös
Subject(s): Social Sciences, Language and Literature Studies, Media studies, Applied Linguistics, Communication studies
Published by: Editura Universităţii de Vest din Timişoara
Keywords: American television; comedy series; corpus design; telecinematic discourse; workplace sitcom;
Summary/Abstract: This article introduces the first version of the Workplace Sitcoms Corpus (WITS), consisting of over 2 million words of dialogue lines extracted from eight contemporary American workplace sitcoms: Ally McBeal (1997-2002), Scrubs (2001-2010), The Office (2005-2013), 30 Rock (2006-2013), Parks and Recreation (2009-2015), Brooklyn Nine-Nine (2013-2021), Superstore (2015-2021) and Abbott Elementary (2021-ongoing). The corpus aims to be representative of a specific subgenre of TV dialogue, with a focus on work-related discourses in fictional contexts. Following Bednarek’s (2018) definition, television dialogue is understood as all linguistic utterances made by actors performing characters on TV series. The WITS corpus is presented in relation to other reference television corpora, such as SydTV (Bednarek, 2018) and the TV Corpus (Davies, 2021). This paper details the corpus design stage, focusing on various selection criteria of titles that are categorized as workplace sitcoms, as well as the data collection and initial transcription stages of building the WITS corpus. It also highlights some of the key challenges in gathering a large amount of television dialogue data: difficulty in finding sources and means of assuring their accuracy, adjustable data cleaning techniques, proper annotation methods and others. Notably, in line with Quaglio’s (2008) findings, I argue that fan transcriptions are the most reliable source of data when it comes to television dialogue, with some noteworthy caveats that will impact my further research.
Journal: Analele Universităţii de Vest din Timişoara. Seria ştiinţe filologice
- Issue Year: 1/2024
- Issue No: 1
- Page Range: 43-57
- Page Count: 15
- Language: English
