Decision Tree for Classification and Regression: A State-of-the-Art Review
DOI:
https://doi.org/10.31449/inf.v44i4.3023
Abstract
Classification and regression both fall under the prediction task of data mining. Classification techniques predict discrete values, whereas regression techniques are best suited to predicting continuous values. Analysts in research areas such as data mining, statistics, machine learning, pattern recognition, and big data analytics have preferred decision trees over other classifiers because they are simple, effective, and efficient, and their performance is competitive with that of other methods. In this paper, we extensively review many widely used state-of-the-art decision tree-based techniques for classification and regression. We present a survey of more than forty years of research on the application of decision trees to both classification and regression. This survey could serve as a resource for researchers interested in applying decision tree classifiers or regressors in their own work.
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika