Optimization Techniques for Dealing with Small Dataset for Sentiment Analysis

JAIT 2025 Vol.16(9): 1226-1235
doi: 10.12720/jait.16.9.1226-1235

Isfaque AL Kaderi Tuhin *, Zhengkui Wang, Xiaorong Li, and Wei Zhang

Information and Communications Technology, Singapore Institute of Technology, Singapore, Singapore
Email: tuhin.kaderi@singaporetech.edu.sg (I.A.K.T.); zhengkui.wang@singaporetech.edu.sg (Z.W.); xiaorong.li@singaporetech.edu.sg (X.L.); wei.zhang@singaporetech.edu.sg (W.Z.)
*Corresponding author

Manuscript received February 3, 2025; revised March 4, 2025; accepted May 29, 2025; published September 5, 2025.

Abstract—Sentiment analysis is crucial for many organizations, including those in the transportation industry, which use it to gain insight into current issues and improve the services provided by public transport operators. However, industries such as transportation struggle to make full use of AI tools because annotated, domain-specific datasets are scarce. This scarcity often stems from the sensitive nature of the data and a shortage of manpower for data annotation. Although many sentiment analysis technologies exist, including state-of-the-art transformer-based models, they typically require access to large, annotated datasets. This leaves a gap in solutions for scenarios characterized by limited and imbalanced data. Our research aims to address this gap by systematically exploring strategies for optimizing multi-class sentiment classification on small, imbalanced datasets. We consider the constraints posed by data privacy and resource limitations, and propose methodologies that improve sentiment analysis accuracy without the need for large datasets or extensive annotation efforts. Using RoBERTa, a transformer-based pre-trained model designed for sentiment analysis, together with a combination of optimization and data augmentation techniques, we aim to extend the capabilities of sentiment analysis models so that they perform effectively in data-sparse situations. Our approach addresses the challenges of small datasets and contributes to the broader field of sentiment analysis by offering scalable solutions that can be adapted to various domain-specific environments. Our experiments achieved significant improvements in prediction accuracy, demonstrating the feasibility and effectiveness of the approach. By integrating theoretical insights with practical applications, our study sheds light on the untapped potential of small datasets in sentiment analysis.
It provides a roadmap for leveraging advanced optimization techniques and inn

Keywords—Natural Language Processing (NLP), sentiment analysis, small data, imbalanced data, transformers

Cite: Isfaque AL Kaderi Tuhin, Zhengkui Wang, Xiaorong Li, and Wei Zhang, "Optimization Techniques for Dealing with Small Dataset for Sentiment Analysis," Journal of Advances in Information Technology, Vol. 16, No. 9, pp. 1226-1235, 2025. doi: 10.12720/jait.16.9.1226-1235

Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
