COMPARATIVE ANALYSIS OF TOPIC MODELING AND LARGE LANGUAGE MODELS IN EXTRACTING INSIGHTS FROM SOCIAL MEDIA CONTENT
Author(s): Vitali Chaiko
Subject(s): Politics / Political Sciences, Economy, Media studies, Business Economy / Management, Communication studies, Theory of Communication, Human Resources in Economy, ICT Information and Communications Technologies
Published by: Университет по библиотекознание и информационни технологии (University of Library Studies and Information Technologies)
Keywords: Topic modelling; Prompt engineering; ChatGPT; Bard; Twitter
Summary/Abstract: The exponential growth of textual data, driven by communication technologies, makes it challenging to extract valuable insights. This study evaluates the effectiveness of the topic modeling algorithms BERTopic and Top2Vec compared with Large Language Models (LLMs), in particular ChatGPT and Google Bard, in the context of social media analysis. Specifically, it investigates how well these models extract topics from a corpus of Twitter data about the company Amazon. The methodology involves preprocessing a subset of more than 720,000 tweets, followed by topic model training and prompt engineering for the LLMs. The study develops quantitative metrics for comparing the models' topic extraction capabilities. Initial results indicate a disparity between topic models and LLMs: the LLMs extract topics that are intuitive to human readers, but show only 15% similarity in exact topic words and 23% similarity in word embeddings relative to the topic models.
Journal: Образование, научни изследвания и иновации (Education, Scientific Research and Innovations)
- Issue Year: II/2024
- Issue No: 2
- Page Range: 42-49
- Page Count: 8
- Language: English
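
Illustrative note: the abstract reports two comparison figures (similarity in exact topic words and similarity in word embeddings) but does not define the metrics. The sketch below shows one plausible way such scores could be computed, assuming Jaccard overlap for exact words and cosine similarity between averaged word vectors. The function names, the sample words, and the toy embedding table are all hypothetical and are not taken from the paper.

```python
import math


def exact_word_overlap(topic_a, topic_b):
    """Jaccard overlap between two lists of topic words (case-insensitive).

    Assumption: the paper's "exact topic word" similarity is not specified
    in the abstract; Jaccard is used here only as an illustration.
    """
    a = {w.lower() for w in topic_a}
    b = {w.lower() for w in topic_b}
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mean_vector(words, embeddings):
    """Average the vectors of the words found in the embedding table."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical topic words, standing in for topic-model and LLM output.
topic_model_words = ["delivery", "package", "prime", "order"]
llm_words = ["delivery", "shipping", "customer", "order"]

# Toy 2-dimensional embeddings, invented purely for this example.
toy_embeddings = {
    "delivery": [1.0, 0.0],
    "package": [0.9, 0.1],
    "shipping": [0.8, 0.2],
    "order": [0.5, 0.5],
}

word_score = exact_word_overlap(topic_model_words, llm_words)
embedding_score = cosine_similarity(
    mean_vector(topic_model_words, toy_embeddings),
    mean_vector(llm_words, toy_embeddings),
)
```

In practice the embedding table would come from a pretrained model rather than hand-written vectors, and the scores would be aggregated over many topic pairs; both choices are left open here because the abstract does not describe them.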
