Analysis Implementation of the Ensemble Algorithm in Predicting Customer Churn in Telco Data: A Comparative Study
DOI:
https://doi.org/10.31449/inf.v47i7.4797
Abstract
Globalization and technological advancements in the telecommunication industry have significantly increased the number of operators and, with it, market competition. The sector has become crucial in developed countries, and companies strive to increase profits by acquiring new customers, up-selling to existing ones, and extending the retention period of current clients. In the traditional method of defect prediction, a single classifier is used to build a model on a pre-labeled dataset. However, this approach cannot predict defects accurately under certain circumstances. Boosting overcomes this limitation by combining multiple weak classifiers into a robust classification model. Among the many algorithms used for churn prediction, ensemble techniques have demonstrated greater accuracy than simpler approaches. This study therefore experiments with five ensemble algorithms: AdaBoost, Gradient Boost, XGBoost, CatBoost, and LightGBM. The results indicate that XGBoost outperforms the other techniques and is the most suitable algorithm for building the predictive model. Additionally, tuning the hyper-parameters of XGBoost with Grid Search CV improves the result further, yielding an accuracy of 81.2%.
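To make the idea of combining weak classifiers concrete, the following is a minimal sketch of plain AdaBoost with one-feature threshold rules ("decision stumps") as the weak learners. It is not the study's pipeline and does not use XGBoost or the Telco dataset: the ten data points, the single unnamed feature, and all function names are assumptions chosen purely for illustration, with +1 standing for "churn" and -1 for "stay".

```python
import math

def stump_predict(x, feature, threshold, polarity):
    """Weak classifier: one threshold test on one feature, returning +1 or -1."""
    return polarity if x[feature] <= threshold else -polarity

def fit_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best

def adaboost_fit(X, y, n_rounds=3):
    """Train AdaBoost; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    weights = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        error, feature, threshold, polarity = fit_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)  # vote of this stump
        ensemble.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Weighted vote of all stumps; returns +1 or -1."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1

# Synthetic toy data (hypothetical, not from the paper): one feature, ten samples.
X = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

model = adaboost_fit(X, y, n_rounds=3)
correct = sum(adaboost_predict(model, row) == label for row, label in zip(X, y))
print(f"ensemble of {len(model)} stumps: {correct}/{len(X)} training points correct")
```

On this toy set no single stump can get more than seven of the ten points right, while the weighted vote of three stumps classifies all ten training points correctly. Gradient-boosting libraries such as XGBoost, CatBoost, and LightGBM follow the same additive principle but fit each new tree to the gradient of a loss function rather than reweighting samples.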
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika