Integrating Information Gain methods for Feature Selection in Distance Education Sentiment Analysis during Covid-19


Author(s): Syamsu Rijal, Pandu Adi Cakranegara, Eka Maya S.S. Ciptaningsih, Putri Hana Pebriana, A Andiyan, Robbi Rahim
Subject(s): ICT Information and Communications Technologies, Distance learning / e-learning
Published by: UIKTEN - Association for Information Communication Technology Education and Science
Keywords: Sentiment Analysis; Random Forest; C4.5; Decision Tree; Twitter

Summary/Abstract: This study analyzes disparities in access to public assistance between rural and urban areas, focusing on how strongly expected quality, perceived quality, and perceived value act as influential factors in citizen satisfaction and loyalty. A household-level survey was carried out in Guayaquil, yielding 428 valid questionnaires in the rural area of Tenguel and 521 valid questionnaires in the urban area of Tarqui, applying the American Customer Satisfaction Index (ACSI). The research used a structural equation model (SEM) to test whether inhabitants of urban and rural areas differ significantly in their perception of the quality of community services, and whether this perception is a determining cause of citizen satisfaction and of loyalty when choosing municipal authorities. The multigroup analysis identified inequalities in the perceived quality of municipal or local community assistance between rural and urban areas; the findings are offered to local public administrators for the design of public policy aimed at improving citizen satisfaction and loyalty.
Sentiment analysis automatically processes text data to determine the sentiment expressed in an opinion sentence. When there are too many reviews, reading them manually takes a lot of time and the resulting judgments become biased. Sentiment classification addresses this problem by grouping user reviews as positive, negative, or neutral. The dataset comes from Drone Emprit Academic and consists of 4,887 crawled tweets containing the words "online learning method". Feature selection uses Information Gain together with AdaBoost on the C4.5 algorithm (FS+C4.5), with the aim of reducing bias and improving accuracy. The experimental results are compared with two standard decision tree models, C4.5 and random forest. Accuracy rose from 48.21% (C4.5) and 50.35% (random forest) to 94.47%, an improvement of roughly 44 percentage points. The FS+C4.5 model also has an RMSE of 0.204 and a correlation of 0.944. Adding the feature selection technique to sentiment analysis of online learning can therefore make the C4.5 algorithm considerably more accurate.
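As a rough illustration of Information Gain feature selection as named in the abstract, the following minimal sketch scores each term by how much knowing its presence reduces the entropy of the sentiment labels, then keeps the top-scoring terms. The toy documents, the labels, and the function names (`entropy`, `information_gain`, `select_top_k`) are hypothetical and are not taken from the paper; the AdaBoost and C4.5 stages are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


def select_top_k(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(
        vocab,
        key=lambda t: information_gain(docs, labels, t),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: each tweet is represented as a set of tokens.
docs = [
    {"online", "learning", "great"},
    {"online", "learning", "boring"},
    {"online", "class", "great"},
    {"online", "class", "boring"},
]
labels = ["positive", "negative", "positive", "negative"]

selected = select_top_k(docs, labels, k=2)
```

In this toy set, terms that appear in every tweet (such as "online") carry no information about the label and score zero, which is the kind of feature such a filter would discard before training a classifier.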

Details

Journal: TEM Journal

Issue Year: 12/2023
Issue No: 1
Page Range: 285-290
Page Count: 6
Language: English
