JAIT 2024 Vol.15(4): 492-499
doi: 10.12720/jait.15.4.492-499

Enhancing Sentiment Analysis Accuracy in Borobudur Temple Visitor Reviews through Semi-Supervised Learning and SMOTE Upsampling

Candra Agustina 1,*, Purwanto Purwanto 2, and Farikhin Farikhin 3
1. Doctoral Program of Information System, School of Postgraduate Studies, Diponegoro University, Semarang, Indonesia
2. Department of Chemical Engineering, Diponegoro University, Semarang, Indonesia
3. Department of Mathematics, Faculty of Science and Mathematics, Diponegoro University, Semarang, Indonesia
Email: candra.caa@bsi.ac.id (C.A.); purwanto@live.undip.ac.id (P.P.); farikhin.math.undip@gmail.com (F.F.)
*Corresponding author

Manuscript received October 25, 2023; revised December 13, 2023; accepted December 25, 2023; published April 9, 2024.

Abstract—Visitor satisfaction with a tourist destination can be gauged from reviews posted on social media. One way to do so is to run sentiment analysis on the comments visitors leave on social media or related websites. This study is a preliminary phase intended to support subsequent research on tourist destination recommendation systems around Borobudur Temple. We conducted sentiment analysis using a semi-supervised learning approach in which the dataset was partitioned into labeled and unlabeled data. The labeled data served as the reference for an automatic labeling process based on the Multinomial Naïve Bayes algorithm. The objective was to extract the sentiments of visitors to Borobudur Temple, which will be used as a variable in subsequent research. Dataset preprocessing comprised data cleaning, sentence segmentation, tokenization, and stop word removal. Labeling outcomes differed by only 0.18% between training without Synthetic Minority Oversampling Technique (SMOTE) Upsampling and training with it. The labeled data were used both to train the model and to evaluate the accuracy of the Multinomial Naïve Bayes algorithm. After SMOTE Upsampling was applied, accuracy rose from 60.59% to 83.68%. This result indicates that applying SMOTE Upsampling to the training data improves sentiment analysis outcomes for tourist reviews.
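
The automatic labeling step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes reviews are already cleaned and tokenized, uses a hand-written Multinomial Naïve Bayes with Laplace smoothing, and all function names and the toy reviews are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit a Multinomial Naive Bayes model on tokenized, labeled reviews."""
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)
    model = {}
    for label in class_docs:
        log_prior = math.log(class_docs[label] / len(docs))
        # Laplace smoothing: every vocabulary token gets alpha pseudo-counts.
        denom = sum(token_counts[label].values()) + alpha * len(vocab)
        log_like = {
            tok: math.log((token_counts[label][tok] + alpha) / denom)
            for tok in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def predict(model, doc):
    """Return the most probable label; tokens outside the vocabulary are ignored."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[t] for t in doc if t in log_like)
    return max(model, key=score)


# Hypothetical toy data: a small labeled set and two unlabeled reviews.
labeled_docs = [
    ["beautiful", "temple"],
    ["amazing", "sunrise"],
    ["crowded", "expensive"],
    ["dirty", "crowded"],
]
labels = ["positive", "positive", "negative", "negative"]
unlabeled_docs = [["beautiful", "sunrise"], ["expensive", "dirty"]]

model = train_mnb(labeled_docs, labels)
# The labeled data acts as the reference; predictions become pseudo-labels.
pseudo_labels = [predict(model, doc) for doc in unlabeled_docs]
```

In a fuller pipeline the pseudo-labeled reviews would be merged with the labeled set before further training and evaluation.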
Keywords—Synthetic Minority Oversampling Technique (SMOTE) Upsampling, sentiment analysis, tourism, semi-supervised learning
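
For readers unfamiliar with SMOTE, the following is a minimal sketch of its core idea: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. The function name, parameters, and toy vectors are hypothetical, and the sketch is not the implementation used in the paper.

```python
import random
from typing import List, Sequence

Vector = List[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_oversample(
    minority: Sequence[Vector], n_new: int, k: int = 2, seed: int = 0
) -> List[Vector]:
    """Create n_new synthetic vectors by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the chosen sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: squared_distance(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical term-count vectors for three minority-class reviews.
minority_vectors = [[2.0, 0.0, 1.0], [3.0, 1.0, 0.0], [2.0, 1.0, 1.0]]
new_samples = smote_oversample(minority_vectors, n_new=4, k=2, seed=42)
```

Because every synthetic vector is an interpolation of two non-negative count vectors, it is also non-negative, so it remains a valid input for a Multinomial Naïve Bayes classifier.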

Cite: Candra Agustina, Purwanto Purwanto, and Farikhin Farikhin, "Enhancing Sentiment Analysis Accuracy in Borobudur Temple Visitor Reviews through Semi-Supervised Learning and SMOTE Upsampling," Journal of Advances in Information Technology, Vol. 15, No. 4, pp. 492-499, 2024.

Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.