Wyzwania i potencjał analiz cyfrowych dużych zbiorów danych tekstowych w badaniach społecznych na przykładzie przygotowania korpusu dokumentów unijnych dotyczących ubóstwa i wykluczenia społecznego z lat 2001-2021
Challenges and Promises of Large Textual Data Sets’ Digital Analysis in Social Research: The Case of Preparing a Corpus of EU Documents on Poverty and Social Exclusion from 2001 to 2021

Author(s): Marianna Zieleńska, Daniel Płatek, Patryk Hubar-Kołodziejczyk, Agnieszka Karlińska
Subject(s): Social Sciences, Sociology
Published by: Instytut Filozofii i Socjologii Polskiej Akademii Nauk
Keywords: network analysis; EU document corpus; FAIR; Linked Open Data; governance by indicators; EU antipoverty policy

Summary/Abstract: The article explores the challenges and promises of using digital methods to analyse large text datasets in the social sciences, focusing on EU policies addressing poverty and social exclusion between 2001 and 2021. It outlines the creation of a corpus of EU documents, from acquiring data via EUR-Lex to processing, metadata enrichment, deduplication and network analysis. Indicators are treated as elements of the EU’s calculative infrastructure, which not only describe social reality but also shape it by framing problems and guiding policy interventions. Network analysis and visualisation reveal shifts in the relationships between indicators, topics and institutions that are hard to capture with traditional methods, while interpretation draws on interviews and a literature review. Findings show that the at-risk-of-poverty rate (AROP) indicator is central to EU policy, while the newer AROPE indicator is less present beyond administrative structures. The growing involvement of agencies, research institutes, consultancy firms and academia reflects the externalisation of expertise. Introducing new indicators extends the EU’s influence into broader areas of social policy. The authors propose a research workflow aligned with the FAIR and Linked Open Data principles, highlighting the value of combining digital and qualitative methods and fostering interdisciplinary collaboration to better understand political processes.
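
The abstract names deduplication and network analysis as steps of the corpus workflow but does not say how the authors implemented them. The Python sketch below is therefore only an illustration under stated assumptions, not the authors' method: the toy records in DOCS, the indicator list in INDICATORS, and the helper names are all hypothetical. It drops exact duplicates by hashing whitespace- and case-normalised text, then counts how often two indicators are mentioned in the same document, which gives the weighted edges of a simple co-occurrence network.

```python
import hashlib
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy records; a real corpus would be built from EUR-Lex documents.
DOCS = [
    {"id": "doc-1", "text": "The AROP rate and the AROPE rate are reported."},
    {"id": "doc-2", "text": "the  AROP rate and the AROPE rate are reported. "},
    {"id": "doc-3", "text": "AROP is compared with material deprivation."},
]

# Hypothetical indicator vocabulary (an assumption, not taken from the article).
INDICATORS = ["AROP", "AROPE", "material deprivation"]


def normalise(text):
    """Collapse whitespace and lowercase so trivial variants hash identically."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(docs):
    """Keep the first document for each distinct normalised-text hash."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalise(doc["text"]).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def indicators_in(text):
    """Return the indicators mentioned in a text, matched as whole words."""
    found = set()
    for name in INDICATORS:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(name)
    return found


def cooccurrence_edges(docs):
    """Count, for each indicator pair, the documents mentioning both."""
    edges = Counter()
    for doc in docs:
        for pair in combinations(sorted(indicators_in(doc["text"])), 2):
            edges[pair] += 1
    return edges


unique_docs = deduplicate(DOCS)
edges = cooccurrence_edges(unique_docs)
for (source, target), weight in sorted(edges.items()):
    print(source, "--", target, "weight", weight)
```

Exact hashing only catches documents that are identical after normalisation; near-duplicates, such as consolidated versions of the same act, would need a similarity measure instead. The resulting edge counts could be loaded into any graph library for visualisation.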

  • Issue Year: 258/2025
  • Issue No: 3
  • Page Range: 73-104
  • Page Count: 32
  • Language: Polish