JAIT 2026 Vol.17(4): 777-787
doi: 10.12720/jait.17.4.777-787

Machine Learning Approaches for Imbalanced Post-Disaster Building Damage Classification: A Case Study of the 2022 Cianjur Earthquake

Eka Rahmawati 1,2,*, Catur Edi Widodo 3, and Sorja Koesuma 4
1. Doctoral Program of Information Systems, School of Postgraduate Studies, Diponegoro University, Semarang, Indonesia
2. Information Systems Study Program, Faculty of Engineering and Informatics, Bina Sarana Informatika University, Jakarta, Indonesia
3. Department of Physics, Faculty of Science and Mathematics, Diponegoro University, Semarang, Indonesia
4. Department of Physics, Faculty of Mathematics and Natural Sciences, Sebelas Maret University, Surakarta, Indonesia
Email: eka.eat@bsi.ac.id (E.R.); caturediwidodo@lecturer.undip.ac.id (C.E.W.); sorja@staff.uns.ac.id (S.K.)
*Corresponding author

Manuscript received November 11, 2025; revised December 31, 2025; accepted January 28, 2026; published April 24, 2026.

Abstract—Quick and accurate assessment of building damage after a disaster is essential for effective recovery and reconstruction planning. Traditional field surveys are reliable, but they are often time-consuming, resource-intensive, and susceptible to human bias. This study presents a comparative evaluation of eight machine learning approaches for multiclass classification of post-earthquake building damage, using real-world data from the 2022 Cianjur earthquake: LightGBM, Extreme Gradient Boosting (XGBoost), Random Forest (RF) combined with the Synthetic Minority Oversampling Technique (SMOTE), Balanced Random Forest (BRF), Random Forest, Logistic Regression, Support Vector Machine (SVM) with a Radial Basis Function (RBF) kernel, and Naive Bayes. The dataset contains 3,063 records describing geospatial and environmental characteristics: epicentral distance, elevation, slope type, soil type, and vulnerability zone. To obtain reliable performance estimates on imbalanced data, the models were assessed with 10-fold cross-validation using five metrics: accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUC). The results indicate that ensemble-based models, particularly XGBoost (AUC = 0.860), LightGBM (AUC = 0.858), and BRF (AUC = 0.857), outperform traditional classifiers in both predictive accuracy and sensitivity toward severely damaged buildings. These results underscore the effectiveness of gradient boosting and balanced ensemble methods in handling complex, imbalanced datasets and highlight their potential for integration into automated post-disaster damage assessment systems. This research contributes to the development of intelligent, data-driven decision-support tools that can help accelerate recovery efforts in earthquake-prone regions.
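The evaluation protocol summarized in the abstract (10-fold cross-validation with class-sensitive metrics) can be illustrated with a minimal, dependency-free sketch. It is not the authors' code. The stratified fold assignment, the macro averaging, the class names, and the 70/20/10 class counts are all assumptions made for illustration; the abstract does not state whether folds were stratified or how per-class metrics were averaged.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so each fold has similar class proportions.

    Assumption: stratification is used here for illustration; the abstract only
    says "10-fold cross-validation".
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across classes to keep fold sizes even
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        for j, idx in enumerate(idxs):
            folds[(offset + j) % k].append(idx)
        offset += len(idxs)
    return folds


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    classes = sorted(set(y_true))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical imbalanced labels (NOT the Cianjur data): 70 / 20 / 10 samples.
labels = ["light"] * 70 + ["moderate"] * 20 + ["severe"] * 10
folds = stratified_folds(labels, k=10, seed=0)

# A majority-class baseline that always predicts "light".
baseline = ["light"] * len(labels)
scores = macro_scores(labels, baseline)

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical labels, the majority-class baseline reaches 70% accuracy while its macro recall is only one third, which is why accuracy alone is a weak yardstick on imbalanced damage classes. AUC is omitted from the sketch because it requires predicted class probabilities rather than hard labels. In practice, a library such as scikit-learn (`StratifiedKFold`) would replace the hand-written fold assignment.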
 
Keywords—post-disaster damage assessment, machine learning, imbalanced classification, Extreme Gradient Boosting (XGBoost), LightGBM, Random Forest (RF), Support Vector Machine (SVM)

Cite: Eka Rahmawati, Catur Edi Widodo, and Sorja Koesuma, "Machine Learning Approaches for Imbalanced Post-Disaster Building Damage Classification: A Case Study of the 2022 Cianjur Earthquake," Journal of Advances in Information Technology, Vol. 17, No. 4, pp. 777-787, 2026. doi: 10.12720/jait.17.4.777-787

Copyright © 2026 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
