Decision Tree for Classification and Regression: A State-of-the Art Review

Authors

  • Monalisa Jena DEPT. OF ICT, FAKIR MOHAN UNIVERSITY, VYASA VIHAR, BALASORE, ODISHA, INDIA-756019
  • Satchidananda Dehuri DEPT. OF ICT, FAKIR MOHAN UNIVERSITY, VYASA VIHAR, BALASORE, ODISHA, INDIA-756019

DOI:

https://doi.org/10.31449/inf.v44i4.3023

Abstract

Classification and regression both fall under the prediction task of data mining. Classification techniques predict discrete values, whereas regression techniques are best suited to predicting continuous values. Analysts from research areas such as data mining, statistics, machine learning, pattern recognition, and big data analytics have preferred decision trees over other classifiers because they are simple, effective, and efficient, and their performance is competitive with that of other methods. In this paper, we extensively review many widely used state-of-the-art decision tree-based techniques for classification and regression. We present a survey of more than forty years of research on the application of decision trees to both classification and regression. This survey could be a useful resource for researchers who are interested in applying decision tree classifiers or regressors in their own work.

Author Biographies

Monalisa Jena, DEPT. OF ICT, FAKIR MOHAN UNIVERSITY, VYASA VIHAR, BALASORE, ODISHA, INDIA-756019

Monalisa Jena has been working as an Assistant Professor in the Department of Information and Communication Technology, Fakir Mohan University, Balasore, Odisha, India, since 2015. Before that, she worked at Government Polytechnic, Balasore, Odisha, as a Lecturer in Computer Application under Skill Development in Technical Education Department, Odisha, India, from 20.11.2013 to 18.12.2015 (two years), appointed by the Odisha Public Service Commission (OPSC). She qualified the UGC-NET for lectureship in 2012. She received her M.Tech. degree from Siksha O Anusandhan University, Bhubaneswar. She has been pursuing her Ph.D. since 2017, with a specialization in “Data Mining and Soft Computing”, at Fakir Mohan University, Balasore. Her other research interests include Big Data Analysis, Machine Learning, Social Networking, and Wireless Mesh Networks. She has published one journal paper and seven conference papers, has one book chapter accepted by CRC Press, and has submitted one paper to a Scopus-indexed journal.

Satchidananda Dehuri, DEPT. OF ICT, FAKIR MOHAN UNIVERSITY, VYASA VIHAR, BALASORE, ODISHA, INDIA-756019

Satchidananda Dehuri has been working as a Professor in the Department of Information and Communication Technology, Fakir Mohan University, Balasore, Odisha, India, since 2013. He received his M.Tech. and Ph.D. degrees in Computer Science from Utkal University, Vani Vihar, Odisha, in 2001 and 2006, respectively. In 2008 he visited the Soft Computing Laboratory, Yonsei University, Seoul, South Korea, as a BOYSCAST Fellow under the BOYSCAST Fellowship Program of DST, Govt. of India. In 2010 he received the Young Scientist Award in Engineering and Technology for the year 2008 from Odisha Vigyan Academy, Department of Science and Technology, Govt. of Odisha. He was a Visiting Scholar at the Center for Theoretical Studies, Indian Institute of Technology Kharagpur, in 2002. During May-June 2006 he was a Visiting Scientist at the Center for Soft Computing Research, Indian Statistical Institute, Kolkata. His research interests include Evolutionary Computation, Neural Networks, Pattern Recognition, Data Warehousing and Mining, Object-Oriented Programming and its Applications, and Bioinformatics. He has published about 200 research papers in reputed journals and refereed conferences, has published five textbooks for undergraduate and postgraduate students, and has edited more than ten books of contemporary relevance. Under his direct supervision, 10 Ph.D. scholars have been awarded their degrees, three scholars have submitted their theses, and eight more are pursuing their Ph.D. work. In addition, he guided two postdoctoral scholars during his two-year stay at Ajou University, South Korea, as an Associate Professor in the Department of System Engineering. He has completed three research projects funded by DST, UGC, and DRDO. His h-index is more than 20. As part of academic collaboration, he has visited Ireland, New Zealand, Hong Kong, France, and Nepal.

Published

2020-12-15

How to Cite

Jena, M., & Dehuri, S. (2020). Decision Tree for Classification and Regression: A State-of-the Art Review. Informatica, 44(4). https://doi.org/10.31449/inf.v44i4.3023

Section

Overview papers