JAIT 2024 Vol.15(4): 519-531
doi: 10.12720/jait.15.4.519-531

Development and Comparison of Multiple Emotion Classification Models in Indonesia Text Using Machine Learning

Ahmad Zamsuri 1,*, Sarjon Defit 2, and Gunadi Widi Nurcahyo 2
1. Department of Informatics Engineering, Faculty of Computer Science,
University of Lancang Kuning, Pekanbaru, Indonesia
2. Department of Information Technology, Faculty of Computer Science,
University of Putra Indonesia YPTK, Padang, Indonesia
Email: ahmadzamsuri@unilak.ac.id (A.Z.); Sarjon_defit@upiyptk.ac.id (S.D.); gunadiwidi@yahoo.co.id (G.W.N.)
*Corresponding author

Manuscript received October 3, 2023; revised November 11, 2023; accepted December 13, 2023; published April 24, 2024.

Abstract—Emotion is an individual’s response to an event or situation. In data science, research on emotion falls under sentiment analysis, which mostly focuses on determining whether the emotional tone is positive or negative. This study classifies emotions into six labels: Happy, Love, Surprise, Anger, Fear, and Sadness. These six labels are then further grouped into positive and negative categories. The dataset comprises tweets from Twitter related to the 2024 presidential election in Indonesia. Several machine learning algorithms are employed, namely Naïve Bayes (Multinomial Bayes, Bernoulli Bayes, and Complement Bayes), K-Nearest Neighbors (KNN), and Support Vector Machines (SVM), and two feature extraction methods are compared: Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BoW). The results show that SVM does not yield better accuracy with either TF-IDF or BoW. Therefore, this study improves the accuracy of SVM by combining the kernels available in SVM (Polynomial, Radial Basis Function (RBF), and Linear). The experiments show that SVM with the combined kernel, referred to in this study as SVM Polynomial, RBF, Linier (PoRLi), achieves better accuracy than SVM with a single kernel. In the six-label classification it reaches 62% accuracy, a 2% increase over SVM Linear, which obtained the highest accuracy among the single kernels. In the two-label classification, SVM PoRLi also yields a 1% increase. It can be concluded that SVM PoRLi can enhance accuracy in both the six-label and two-label settings.
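
The abstract does not state how the three kernels are combined, so the following Python sketch is only one plausible reading: it assumes an unweighted sum of the polynomial, RBF, and linear kernels, passed to scikit-learn's `SVC` as a custom kernel, and tries it with both BoW and TF-IDF features. The function names `combined_kernel` and `fit_and_predict`, the default kernel parameters, and the toy English sentences are illustrative assumptions, not the authors' implementation or data.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel
from sklearn.svm import SVC


def combined_kernel(X, Y):
    """Unweighted sum of polynomial, RBF and linear kernels.

    Assumption: the paper's exact combination rule is not given in the
    abstract; a sum of valid kernels is itself a valid kernel.
    """
    return polynomial_kernel(X, Y) + rbf_kernel(X, Y) + linear_kernel(X, Y)


def fit_and_predict(vectorizer, train_texts, train_labels, new_texts):
    """Vectorize the texts, train an SVC with the combined kernel, predict."""
    X_train = vectorizer.fit_transform(train_texts).toarray()
    X_new = vectorizer.transform(new_texts).toarray()
    clf = SVC(kernel=combined_kernel)  # SVC accepts a callable kernel
    clf.fit(X_train, train_labels)
    return clf.predict(X_new)


# Toy sentences standing in for tweets (hypothetical, not the study's dataset).
texts = [
    "i am so happy today",
    "what a happy and joyful day",
    "i am angry about this",
    "this makes me so angry",
    "i feel sad and alone",
    "such a sad sad story",
]
labels = ["Happy", "Happy", "Anger", "Anger", "Sadness", "Sadness"]

# Compare the two feature extraction methods named in the abstract.
predictions = {}
for name, vectorizer in [("BoW", CountVectorizer()), ("TF-IDF", TfidfVectorizer())]:
    predictions[name] = fit_and_predict(vectorizer, texts, labels, ["so happy"])
    print(name, predictions[name])
```

A weighted sum or a voting ensemble of three single-kernel SVMs would be equally consistent with the abstract's wording; the sum is chosen here only because it is the simplest combination that remains a valid kernel.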
 
Keywords—Term Frequency-Inverse Document Frequency (TF-IDF), Bag of Words (BoW), machine learning, multiple emotion, Support Vector Machines (SVM), Radial Basis Function (RBF), Polynomial, RBF, Linier (PoRLi), SVM PoRLi

Cite: Ahmad Zamsuri, Sarjon Defit, and Gunadi Widi Nurcahyo, "Development and Comparison of Multiple Emotion Classification Models in Indonesia Text Using Machine Learning," Journal of Advances in Information Technology, Vol. 15, No. 4, pp. 519-531, 2024.

Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.