Comparison of Two Main Approaches for Handling Imbalanced Data in Churn Prediction Problem

Nam N. Nguyen and Anh T. Duong
Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam

Abstract—Customer churn is a major problem in service industries such as banking and telecommunications because of its profound impact on company revenue. However, existing churn prediction algorithms remain limited because churn data is usually imbalanced: churners make up only a small fraction of customers. The techniques commonly used to handle imbalanced data in churn prediction fall into two categories: resampling methods, which balance the data before model training, and cost-sensitive learning methods, which adjust the relative costs of errors during model training. In this paper, we compare two data resampling methods, SMOTE and Deep Belief Network (DBN), against two cost-sensitive learning methods, focal loss and weighted loss, on the churn prediction problem. The empirical results show that, for churn prediction, the overall predictive performance of the focal loss and weighted loss methods is better than that of SMOTE and DBN.
Index Terms—churn prediction, deep belief network, SMOTE, focal loss, weighted loss
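
For readers unfamiliar with the two cost-sensitive methods named in the abstract, the following is a minimal sketch of a class-weighted binary cross-entropy and of the commonly used binary form of focal loss. It is an illustration only: the abstract does not give the paper's exact formulations, and the function names and the parameters `pos_weight`, `gamma`, and `alpha` (with their default values) are assumptions made here.

```python
import numpy as np


def weighted_bce(y_true, p, pos_weight=1.0, eps=1e-7):
    """Binary cross-entropy with an extra weight on the positive (churn) class.

    pos_weight is an assumed parameter name; values above 1 make a missed
    churner cost more than a false alarm.
    """
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log() never receives 0.
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    loss = -(pos_weight * y_true * np.log(p)
             + (1.0 - y_true) * np.log(1.0 - p))
    return float(loss.mean())


def focal_loss(y_true, p, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: cross-entropy scaled by (1 - p_t) ** gamma.

    p_t is the probability the model assigns to the true class, so examples
    that are already classified well contribute little to the loss.
    gamma and alpha defaults are illustrative assumptions.
    """
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    p_t = np.where(y_true == 1, p, 1.0 - p)
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    loss = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
    return float(loss.mean())


if __name__ == "__main__":
    # Toy data: one churner among four customers (not from the paper).
    labels = [1, 0, 0, 0]
    probs = [0.3, 0.1, 0.2, 0.4]
    print("weighted BCE:", weighted_bce(labels, probs, pos_weight=3.0))
    print("focal loss:  ", focal_loss(labels, probs))
```

Both losses leave the training data unchanged and shift the cost of errors instead, which is what distinguishes them from resampling methods such as SMOTE. With `gamma=0` and `alpha=0.5`, this focal loss reduces to half of the unweighted cross-entropy.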

Cite: Nam N. Nguyen and Anh T. Duong, "Comparison of Two Main Approaches for Handling Imbalanced Data in Churn Prediction Problem," Journal of Advances in Information Technology, Vol. 12, No. 1, pp. 29-35, February 2021. doi: 10.12720/jait.12.1.29-35

Copyright © 2021 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.