Classification Based on Decision Tree Algorithm
1 IT Department, Technical College of Informatics Akre, Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, [email protected]
2 Presidency of Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, [email protected]
*Correspondence: [email protected]
Abstract
Decision tree classifiers are regarded as one of the most well-known methods of data classification. Researchers from various fields and backgrounds, such as machine learning, pattern recognition, and statistics, have considered the problem of growing a decision tree from available data. The use of decision tree classifiers has been proposed in many fields, such as medical disease analysis, text classification, user smartphone classification, images, and many more. This paper provides a detailed overview of decision trees. Furthermore, paper specifics, such as the algorithms/approaches used, the datasets, and the outcomes achieved, are evaluated and outlined comprehensively. In addition, all of the approaches analyzed are discussed to illustrate the themes of the authors and to identify the most accurate classifiers. As a result, the uses of different types of datasets are discussed and their findings are analyzed.
doi: 10.38094/jastt20165
training process [15 - 17]. The amount of data obtained in data mining environments is huge [18 - 20]. If the data set is properly classified and contains the minimum number of nodes, then using the decision tree method is optimal [21 - 23].

A decision tree is a tree-based technique in which any path beginning from the root is described by a data-separating sequence until a Boolean outcome is reached at the leaf node [24 - 27]. It is a hierarchical representation of knowledge relationships that contains nodes and connections; when the tree is used to classify, the nodes represent tests on attributes and the connections represent the outcomes of those tests [28 - 31].
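To make the root-to-leaf path concrete, here is a minimal Python sketch of such a tree; the node layout, feature names, and thresholds are illustrative assumptions of ours, not taken from any cited work:

```python
# Minimal decision tree walk: each internal node tests one attribute
# against a threshold; each leaf carries a Boolean class label.

class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, label=None):
        self.feature = feature      # attribute tested at this node (None for leaves)
        self.threshold = threshold  # split value for the test
        self.left = left            # subtree taken when sample[feature] <= threshold
        self.right = right          # subtree taken when sample[feature] > threshold
        self.label = label          # Boolean outcome stored at a leaf

def classify(node, sample):
    """Follow one path from the root until a leaf is reached."""
    while node.label is None:
        node = node.left if sample[node.feature] <= node.threshold else node.right
    return node.label

# A hand-built toy tree: the root tests "age", its right child tests "income".
tree = Node(feature="age", threshold=30,
            left=Node(label=False),
            right=Node(feature="income", threshold=50_000,
                       left=Node(label=False),
                       right=Node(label=True)))

print(classify(tree, {"age": 42, "income": 72_000}))  # True
```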
In this paper, a comprehensive review is performed of the latest and most efficient approaches to decision trees taken by researchers in the past three years across different areas of machine learning. Also, the details of these works, such as the algorithms/approaches used, the datasets, and the findings achieved, are summarized. In addition, this study highlights the most commonly used approaches and the methods that achieved the highest accuracy.
The organization of the remaining paper is as follows: Section II presents the decision tree algorithm, mentioning its types, benefits, and drawbacks; Section III gives a literature review on the decision tree algorithm; Section IV provides a comparison and discussion of the decision tree; and the last section contains the conclusion.
II. DECISION TREE ALGORITHM

One of the widely used techniques in data mining is systems that create classifiers [32]. In data mining, classification algorithms are capable of handling a vast volume of information. They can be used to make assumptions regarding categorical class names, to classify knowledge on the basis of training sets and class labels, and to classify newly obtained data [33]. Classification in machine learning comprises several algorithms, and this paper focuses on the decision tree algorithm in general. Fig. 1 illustrates the structure of a DT.

A threshold value is applied in each test [36]. The conceptual rules are much easier to construct than the numerical weights of the connections between nodes in a neural network [37, 38]. DT is used mainly for grouping purposes. Moreover, DT is a commonly utilized classification model in data mining [39]. Each tree is composed of nodes and branches; each node represents a feature in a category to be classified, and each subset defines a value that can be taken by the node [40, 41]. Because of their simple analysis and their precision on multiple data forms, decision trees have found many implementation fields [42]. Fig. 2 shows an example of a DT.

Fig. 2. Example of a Decision Tree [43]

A. Types of Decision Tree Algorithms
There are several types of DT algorithms, such as: Iterative Dichotomiser 3 (ID3), the successor of ID3 (C4.5), Classification And Regression Tree (CART) [44], CHi-squared Automatic Interaction Detector (CHAID) [45], Multivariate Adaptive Regression Splines (MARS) [46], Generalized, Unbiased, Interaction Detection and Estimation (GUIDE), Conditional Inference Trees (CTREE) [47], [48], Classification Rule with Unbiased Interaction Selection and Estimation (CRUISE), and Quick, Unbiased and Efficient Statistical Tree (QUEST) [49], [50]. Table I shows a comparison between the frequently used decision tree algorithms [51].

TABLE I: COMPARISON BETWEEN THE FREQUENTLY USED DECISION TREE ALGORITHMS [51] (of the original table, only the input-variables row survives extraction: each compared algorithm accepts categorical/continuous input variables)
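As a concrete reference point for these variants, the short sketch below trains a CART-style tree with scikit-learn, whose DecisionTreeClassifier implements an optimized version of CART; the dataset and hyperparameters are our own illustrative choices, not those of any study reviewed here:

```python
# Train a CART-style decision tree with scikit-learn and inspect its rules.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# criterion="entropy" mimics ID3/C4.5-style information-gain splits;
# "gini" is the CART default.
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=42)
clf.fit(X_train, y_train)

print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
print(export_text(clf))  # the learned root-to-leaf decision rules
```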
B. Entropy and Information Gain
Entropy is employed to measure a dataset's impurity or randomness [52], [53]. For a two-class problem, the value of entropy always lies between 0 and 1. Its value is best when it is equal to 0 and worst when it is equal to 1, i.e. the closer its value is to 0, the better, as shown in Fig. 3. If the target classification takes c different values, the entropy of a set S with respect to this c-wise classification is given by equation (1) [54], [55]:

Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i        (1)

where p_i is the proportion of examples in S belonging to class i.
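As a from-scratch illustration of equation (1) and the information gain it induces, consider the following Python sketch; the toy label counts are our own, chosen to mirror the classic 9-positive/5-negative example:

```python
# Entropy of a label set and information gain of a candidate split,
# computed directly from equation (1).
from collections import Counter
import math

def entropy(labels):
    """Entropy(S) = sum over classes of -p_i * log2(p_i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, subsets):
    """Gain = Entropy(parent) - weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent) - weighted

labels = ["yes"] * 9 + ["no"] * 5      # 9 positive, 5 negative examples
print(round(entropy(labels), 3))       # 0.94
left, right = labels[:7], labels[7:]   # one candidate partition of the set
print(round(information_gain(labels, [left, right]), 3))  # 0.509
```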
III. LITERATURE REVIEW

Assegie and Nair [65] suggested that a decision tree classifier achieving 83.4% accuracy had an impact on handwritten number recognition.

De Felice et al. [66] suggested a decision tree algorithm to recognize known and novel clinical indications before treatment for survival in Locally Advanced Rectal Cancer (LARC). The analysis showed that even non-experts in the field can easily interpret the tree-based machine learning process, in particular classification trees, although validation errors need to be managed for the trees to achieve their statistical capacity. Patients with histologically confirmed LARC between 2007 and 2014 had their data checked, and the Kaplan-Meier approach was used to determine overall survival (OS). The study involved a total of 100 patients; the 5-year and 7-year OS rates were 76.4% and 71.3%. Age, comorbidity, tumor size, clinical tumor classification (cT), and clinical node classification (cN) are important predictive variables for tree composition. The results showed that the highest survival rates were in elderly patients with a tumor size of less than 5 cm and in patients under the age of 65 years who had cT3 disease. A decision tree is a way of achieving better clinical decision-making based on broad data sets.
TABLE III: SUMMARY OF LITERATURE REVIEW RELATED TO THE DT ALGORITHM

Ref. | Year | Dataset | Technique(s) | Accuracy
Nandhini and K.S [76] | 2020 | UCI | DT, KNN, LR, SVM and NB | DT: 99.93%, KNN: 99.93%, LR: 93.13%, SVM: 90.76%, NB: 79.52%
Nagra et al. [80] | 2020 | UCI | SIW-APSO-LS | SIW-APSO-LS: 99.88%
Kuang et al. [72] | 2020 | SCBs | sSCC | decreasing computational complexity by 47.62% on average
Pathan et al. [79] | 2020 | fundus images | Optic Disc (OD) segmentation | OD: 99.61%
Batitis et al. [74] | 2020 | red blood cell images | DT | DT: 89.31%
Ramadhan et al. [73] | 2020 | CICIDS2017 | DT and KNN | DT: 99.91%, KNN: 98.94%
Arowolo et al. [78] | 2020 | RNA-seq Malaria | KNN and DT | KNN: 86.7%, DT: 83.3%
De Felice et al. [66] | 2020 | patients with histologically proven LARC between 2007 and 2014 | Kaplan-Meier method | 5-year OS: 76.4%, 7-year OS: 71.3%
Zhang et al. [75] | 2019 | smokers, Chinese Center for Disease Control and Prevention | DT (XGBoost) and RF | DT: 84.11%, RF: 58.11%
Sathiyanarayanan et al. [83] | 2019 | Wisconsin Breast Cancer dataset | DT and KNN | DT: 99%, KNN: 97%
Hu et al. [68] | 2019 | UCI and COMPAS | OSDT and BinOCT | COMPAS/OSDT: 66.90%; UCI/OSDT: 82.881%, BinOCT: 76.722%
Patil and Kulkarni [69] | 2019 | UCI | DST, PT and MLT | DST: best 99.9%, worst 81.445%
Sarker et al. [67] presented a behavioral decision tree, named "BehavDT", a context-aware structure that takes into account consumer behavior-oriented generalization according to the degree of personal choice. In exceptional cases of association, the BehavDT model provided comprehensive decisions as well as context-specific decisions. Experiments were carried out on real smartphone datasets of individual users to test the efficiency of the BehavDT model. The results indicated that the BehavDT context-aware model, whose accuracy is up to 90%, is the most effective model compared to other conventional machine learning models.

Hu et al. [68] illustrated the first practical algorithm for optimizing decision trees over binary variables. The algorithm is a co-design of analytical bounds, a dedicated bit-vector library, and data structures that minimize the search space, together with current implementation technologies. They used the Binary Optimal Classification Trees (BinOCT) method, the currently publicly available method, to assess accuracy and compare it with Optimal Sparse Decision Trees (OSDT). They utilized text datasets from the University of California, Irvine (UCI) Machine Learning Repository and numeric datasets from the ProPublica COMPAS data. The findings showed that on a COMPAS dataset, the optimal decision tree produced by OSDT reached an accuracy of 66.90%. Besides, when BinOCT and OSDT generated decision trees on the UCI dataset, their accuracy was 76.722% and 82.881%, respectively.

Patil and Kulkarni [69] introduced a Distributed Spark Tree (DST) to better execute the DT algorithm in terms of model construction time without losing accuracy, and suggested using it in Spark's environment. Data in Spark's shared architecture does not perform horizontal parallel execution; Spark functions well and coherently with in-memory computations, RDDs, and map reduction. The dataset used was from the UCI ML repository, and four classes were chosen. Wide data files were utilized to test performance regarding model build time for DST, PySpark (PT), and MLLib (MLT). The findings showed that in terms of accuracy, DST performed better than both PT and MLT, as its lowest value was 81.445% and, depending on the scale of the dataset, its highest was 99.9%.
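For context on the Spark side of this comparison, a minimal PySpark sketch of the MLlib decision tree baseline is shown below; it is a generic illustration of Spark's public API under our own toy data, not the authors' DST implementation:

```python
# Training a decision tree on Spark via the MLlib (pyspark.ml) API,
# the kind of baseline the DST study compares against.
from pyspark.ml.classification import DecisionTreeClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dt-demo").getOrCreate()

# Tiny in-memory stand-in for a UCI-style table.
df = spark.createDataFrame(
    [(5.1, 3.5, 0.0), (6.2, 2.9, 1.0), (4.9, 3.0, 0.0), (6.7, 3.1, 1.0)],
    ["f1", "f2", "label"])

# Assemble the raw columns into the single feature vector MLlib expects.
assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
train = assembler.transform(df)

model = DecisionTreeClassifier(labelCol="label", featuresCol="features").fit(train)
preds = model.transform(train)
acc = MulticlassClassificationEvaluator(
    labelCol="label", metricName="accuracy").evaluate(preds)
print(f"training accuracy: {acc:.2f}")
spark.stop()
```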
Hussain et al. [70] offered a modern approach, namely the Pixel Label Decision Tree (PLDT), and checked whether it can achieve better detailed femur segmentation efficiency in DXA imaging. PLDT includes feature extraction and selection, and was used to uncover hidden patterns found in DXA pictures, in contrast to photographic images. To decide the best feature set for the model, PLDT generates seven new feature sets; Global Threshold (GT), Region Growing Threshold (RGT), and Artificial Neural Networks (ANN) were used for comparison. The results revealed that in segmenting DXA images, PLDT exceeds the other conventional partition techniques: the accuracy of PLDT is 91.4%, while GT is 68.4%, RGT is 76%, and ANN is 84.4%.

Linty et al. [71] proposed a new approach based on the amplitude of signals from the Global Navigation Satellite System (GNSS), used to detect ionospheric scintillation events with attention to accuracy, reliability, and readiness. A broad collection of 50 Hz post-correlation data was supplied by the GNSS receiver. The outcomes showed that this method, in terms of accuracy and F-score, exceeds state-of-the-art techniques and can achieve a human-driven standard, which is the level of manual annotation: it reaches 98% identification, very similar to human-driven classification.

Kuang et al. [72] proposed a decision tree-based structure for fast intra-mode decision in Screen Content Coding (SCC), trained by testing different features on the training sets. Moreover, to prevent an exhaustive search process, a sequential arrangement of decision trees was illustrated. In addition, SCBs were used as datasets to balance the SCC with the Intra Block Copy (IBC) and PaLeTte (PLT) modes. The results indicated that the SCC system offers a 47.62% decrease in computational complexity on average, with a small 1.42% increase in Bjøntegaard delta bitrate (BDBR).

Ramadhan et al. [73] demonstrated a comparative analysis of accuracy and processing time for the K-Nearest Neighbor (KNN) and Decision Tree (DT) algorithms in detecting DDoS attacks. Moreover, they used the CICIDS2017 dataset, which consists of the latest attacks and global packages, is standard, and is applicable to real-world data in PCAP format. The findings showed that the accuracy of DT in detecting DDoS attacks was higher than that of KNN: the accuracy of DT was 99.91%, and the accuracy of KNN was 98.94%.

Batitis et al. [74] presented a system to identify up to 10 types of irregular red blood cells and to determine the accuracy rate for all abnormal red blood cells. To detect irregular red blood cells, they employed a DT algorithm in image processing and used images of former patients from hospitals; a camera was used to capture the slides and feed them into the software. The results showed that the accuracy rate averaged 89.31% and the error rate averaged 10.69%. Furthermore, the central irregularity of the codocyte pallor was found to be a cause of mistakes in the classification of abnormal red blood cells.

Zhang et al. [75] proposed a model based on the decision tree machine learning algorithm named Extreme Gradient Boosting (XGBoost) for the prediction of regular smoking time. Furthermore, to create a simulated data set for smoking time data, the Chinese Center for Disease Control and Prevention collected people's information from smokers. They also used a module for extracting feature information; to evaluate its output, they compared the decision tree (XGBoost) module with the Random Forest machine learning algorithm. The results showed that DT efficiency is higher than RF, with DT achieving 84.11% accuracy versus 58.11% for RF.

Nandhini and K.S [76] discussed effective methods of developing a machine learning model using some of the common algorithms that can distinguish whether mail is spam or ham. The Spambase dataset from the UCI Machine Learning repository was used. Besides, they evaluated the output of Logistic Regression (LR), DT, Naïve Bayes (NB), KNN, and Support Vector Machine (SVM) to construct an efficient machine learning model for spam, using the Weka tool to train and evaluate the data collection. The results indicated that DT performance is comparable to or better than KNN performance; the accuracies are as follows: DT is 99.93%, KNN is 99.93%, LR is 93.13%, SVM is 90.76%, and NB is 79.52%.

Taloba and Ismail [77] developed a new machine learning approach hybridizing a decision tree and a genetic algorithm, known as GADT, for spam detection. The genetic algorithm is the most significant algorithm for enhancing decision tree efficiency, and it is efficient and reliable for text classification; it optimizes the confidence factor that governs decision tree pruning to find its optimum value. They used the UCI Machine Learning repository spam dataset. Besides, they used Principal Component Analysis (PCA) to delete features that are inappropriate for email message content. The findings showed that the hybrid GADT approach has an accuracy of 93.4% before using PCA and an accuracy of 95.5% after using PCA, which implies that removing inappropriate features with PCA has a great impact.

Arowolo et al. [78] implemented a Principal Component Analysis (PCA) feature extraction algorithm to reduce dimensionality and analyze high-dimensional gene expression evidence. The KNN and DT classification algorithms were utilized to detect various biological structures, to offer better-value resolution, and to detect new malaria genes and prediction tests. Ribonucleic acid sequencing (RNA-seq) data was used as the data collection. The results indicated that the performance of the KNN classifier is better than the DT classifier on the PCA-extracted features: the accuracy of KNN reaches 86.7% while DT reaches 83.3%.

Pathan et al. [79] proposed a new technique that recognizes and removes the blood vessels for correct segmentation of the Optic Disc (OD). This is done in two steps. First, a directional filter is used to build an efficient blood vessel identification and exclusion algorithm. In the second step, to detect the contour of the optic disc, the decision tree classifier is utilized to achieve an adaptive threshold. Two separate databases were used: 300 fundus images obtained from Kasturba Medical College (KMC) Manipal, and the publicly accessible RIM-ONE database. The results showed that a fully automatic OD segmentation technique that uses a decision tree classifier to achieve the segmentation threshold improves the robustness of the algorithm, even for images containing exudates, vesicle atrophy, and reversals, hence resulting in an appropriate fractionation of the OD. The mathematical study demonstrates the effect of preprocessing: the average accuracy obtained for KMC images is 99.61%, and for the RIM-ONE database the obtained average accuracy is 99.15%.
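Several of the studies above ([77], [78]) pair PCA feature reduction with DT or KNN classifiers. The following sketch shows one plausible wiring of such a pipeline in scikit-learn, with synthetic data standing in for the gene-expression and spam datasets; it is our own generic reconstruction, not the exact setup of either study:

```python
# PCA feature reduction followed by DT and KNN classification,
# in the spirit of the pipelines reviewed above.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a high-dimensional gene-expression matrix.
X, y = make_classification(n_samples=300, n_features=500,
                           n_informative=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

for name, clf in [("DT", DecisionTreeClassifier(random_state=0)),
                  ("KNN", KNeighborsClassifier(n_neighbors=5))]:
    model = make_pipeline(PCA(n_components=20), clf)  # reduce, then classify
    model.fit(X_train, y_train)
    print(name, round(model.score(X_test, y_test), 3))
```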
Nagra et al. [80] introduced the Self-Inertia Weight Adaptive Particle Swarm Optimization with Gradient-Based Local Search (SIW-APSO-LS) approach, modified to conduct feature selection, with the C4.5 decision tree method used as a classifier to evaluate the given feature subsets. To compare algorithms on feature selection problems, 16 datasets from the UCI Machine Learning Repository were used in the experiments. The experimental outcomes demonstrate that SIW-APSO-LS simplifies the collection of features by effectively decreasing the number of features picked while maintaining the best precision compared to other feature selection approaches in the literature on the same test functions. In the field of attribute selection, the experimental findings showed that the proposed approach is useful, and the highest accuracy obtained across the 16 datasets is 99.88%.

Ahmim et al. [81] proposed a new Intrusion Detection System (IDS) that incorporates diverse classification systems based on DT and rule-based concepts, namely the REP tree, the JRip algorithm, and Forest PA. Specifically, the first and second approaches take dataset features as inputs and categorize the network traffic as Attack/Benign, while the third classifier uses the attributes of the original data collection together with the outputs of the first and second classifiers. The research findings, achieved by using the CICIDS2017 dataset to analyze the IDS, testify to its dominance in terms of accuracy, detection rate, false alarm rate, and time overhead relative to current state-of-the-art schemes. In detail, their model has the highest detection rate (DR) at 94.457%, the highest precision at 96.665%, and the lowest false alarm rate (FAR) at 1.145%, and its low computing time allows it to be quickly implemented in a soft real-time system.

Li et al. [82] provided an evidential decision tree to classify fuzzy datasets, with belief (Deng) entropy used as the indicator of the partition rules for its construction. Moreover, the Basic Belief Assignments (BBAs) of the Iris and Wine datasets are utilized to calculate the optimal splitting feature: the lower the Deng entropy, the more effective the feature is at characterizing the samples. In contrast to the standard combination rules employed for combining BBAs, the evidential DT can be extended specifically to classification. The findings showed that the implementation of the evidential DT based on belief entropy effectively decreases the complexity of fuzzy data classification, such as deciding whether a patient is affected by a malignant or benign cancer type. The Wisconsin Breast Cancer dataset, containing 32 attributes and 569 records, was used, with a 10-fold cross-validation test to identify and analyze the algorithms. The accuracy is 95% when using the Wine dataset, while the accuracy obtained on the Iris dataset is 98%.

Sathiyanarayanan et al. [83] used the DT algorithm under the supervised learning mechanism to detect breast cancer. Breast cancer identification is conducted on data that is separated into preparation and testing sets, and the result obtained is contrasted between the KNN and DT algorithms. The findings reveal that the accuracy obtained by KNN is 97%, while DT reaches a maximum accuracy of 99%. Therefore, a decision tree algorithm under supervised learning methods can predict the type of cancer.

IV. COMPARISON AND DISCUSSION

Decision tree classification algorithms consist of several types that are used to generate a DT, handling both continuous and categorical attributes as well as missing values. The generated DT is typically represented as a statistical classifier and can also be used for clustering. Nodes and branches make up the DT: each node poses a test based on one or more properties, e.g. comparing an attribute value with a constant, or using other functions to compare more than one property. For the purposes of the decision tree, the learned data collection is sometimes referred to as the outcome tree. To perform classification in machine learning and data mining using the DT algorithm, the algorithm is applied iteratively, and classification requires a three-stage process: model construction (learning), model evaluation (accuracy), and model use (classification). The DT classification stage is based on the percentage of acquired information, which is measured by entropy. This metric is used to describe the test characteristic for a node in the tree and is referred to as the attribute selection measure: the property with the best information gain is chosen as the test function for the current node. Some studies proposed approaches to overcome the shortcomings of DT problems so that optimal trees can be computed; based on the review performed earlier, DT methods have shown that such problems as described above can be avoided and, furthermore, will provide an appropriate solution for the specified dataset. According to Table III, it was observed that many studies were conducted with different datasets, and the DT approach was used to resolve its weaknesses and to obtain better performance. Several optimization techniques were used in the study [76] to strengthen the decision tree on the stored UCI ML datasets; based on the assessment findings, it was shown that the DT approach achieved the highest accuracy, 99.93%, compared to other techniques such as KNN, LR, SVM, and NB, which perform less well than the DT approach. In the segmentation task, the study [79] used the DT approach to identify and extract the blood vessels for proper Optic Disc (OD) segmentation, which resulted in results as high as 99.61%. Moreover, based on the study [69], it has been shown that the DST method can also improve the DT, where PT and MLT were used alongside DST for DT on the UCI datasets; it was shown that DST is more capable of enhancing DT than the other techniques. Ultimately, using UCI Machine Learning repository datasets and the CICIDS2017 dataset, which consists of the latest attacks among all the other datasets, DT proved to be the highest and best performing in accuracy. The study [75] utilized the DT (XGBoost) and RF on the datasets of smokers from the Chinese Center for Disease Control and Prevention, and it was found that, again, the DT approach achieved the highest accuracy, which is 84.11%. Furthermore, based on studies [73], [78], [83] using DT and KNN on the CICIDS2017, RNA-seq Malaria, and Wisconsin Breast Cancer datasets, it was found that the DT approach had the highest accuracy in all three studies; its accuracy was highest when using the CICIDS2017 dataset, where it achieved 99.91%.
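As a schematic of the three-stage process just described (construct, evaluate, use) and of the DT-versus-KNN comparisons that recur in Table III, the following hedged sketch runs both classifiers under 10-fold cross-validation on scikit-learn's built-in breast cancer data; it mirrors the methodology generically and will not reproduce any study's exact figures:

```python
# Three-stage use of classifiers: construct (fit), evaluate (cross-validate),
# and use (predict), comparing DT against KNN as in several surveyed studies.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "DT": DecisionTreeClassifier(random_state=1),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

for name, model in models.items():
    # Stages 1 and 2: construct the model and evaluate it with 10-fold CV.
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.3f}")

# Stage 3: use the chosen model to classify a new sample.
best = models["DT"].fit(X, y)
print("prediction for first sample:", best.predict(X[:1])[0])
```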
V. CONCLUSION

Decision tree classifiers are known for their enhanced view of performance outcomes. Because of their strong precision, optimized splitting parameters, and enhanced tree pruning techniques, algorithms such as ID3, C4.5, CART, CHAID, and QUEST are commonly used among recognized data classifiers. Separate datasets are used to draw training samples from a huge data set, which in turn affects the precision on the test set. Decision trees have several possible concerns regarding robustness, scalability, and optimization of tree height. But, in contrast to other methods of data classification, decision trees create an efficient rule collection that is simple to understand. This paper reviews the most recent research conducted in many areas, such as the analysis of medical diseases, classification of texts, classification of user smartphones and images, etc. Furthermore, the techniques/algorithms used, the datasets used by the authors, and the achieved outcomes related to accuracy are summarized for decision trees. Finally, the best accuracy achieved for the decision tree algorithm is 99.93%, obtained when using a machine learning repository dataset.

REFERENCES

[1] D. Abdulqader, A. Mohsin Abdulazeez, and D. Zeebaree, “Machine Learning Supervised Algorithms of Gene Selection: A Review,” Apr. 2020.
[2] M. W. Libbrecht and W. S. Noble, “Machine learning applications in genetics and genomics,” Nature Reviews Genetics, vol. 16, no. 6, pp. 321–332, 2015.
[3] J. Wang, P. Neskovic, and L. N. Cooper, “Training Data Selection for Support Vector Machines,” in Advances in Natural Computation, vol. 3610, L. Wang, K. Chen, and Y. S. Ong, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005, pp. 554–564.
[4] D. Maulud and A. M. Abdulazeez, “A Review on Linear Regression Comprehensive in Machine Learning,” Journal of Applied Science and Technology Trends, vol. 1, no. 4, pp. 140–147, 2020.
[5] G. Carleo et al., “Machine learning and the physical sciences,” Reviews of Modern Physics, vol. 91, no. 4, p. 045002, 2019.
[6] T. Hillel, M. Bierlaire, M. Elshafie, and Y. Jin, “A systematic review of machine learning classification methodologies for modelling passenger mode choice,” Journal of Choice Modelling, p. 100221, 2020.
[7] D. Zeebaree, H. Haron, A. Mohsin Abdulazeez, and D. Zebari, Machine Learning and Region Growing for Breast Cancer Segmentation. 2019, p. 93.
[8] C. Feng, S. Wu, and N. Liu, “A user-centric machine learning framework for cyber security operations center,” in 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), Beijing, China, Jul. 2017, pp. 173–175, doi: 10.1109/ISI.2017.8004902.
[9] S. B. Kotsiantis, I. Zaharakis, and P. Pintelas, “Supervised machine learning: A review of classification techniques,” Emerging Artificial Intelligence Applications in Computer Engineering, vol. 160, no. 1, pp. 3–24, 2007.
[10] S. B. Kotsiantis, I. D. Zaharakis, and P. E. Pintelas, “Machine learning: a review of classification and combining techniques,” Artif. Intell. Rev., vol. 26, no. 3, pp. 159–190, Nov. 2006, doi: 10.1007/s10462-007-9052-3.
[11] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: a review,” ACM Computing Surveys, 1999.
[12] D. Sharma and N. Kumar, “A Review on Machine Learning Algorithms, Tasks and Applications,” vol. 6, pp. 2278–1323, Oct. 2017.
[13] K. Pahwa and N. Agarwal, “Stock Market Analysis using Supervised Machine Learning,” in 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, Feb. 2019, pp. 197–200, doi: 10.1109/COMITCon.2019.8862225.
[14] M. Pérez-Ortiz, S. Jiménez-Fernández, P. A. Gutiérrez, E. Alexandre, C. Hervás-Martínez, and S. Salcedo-Sanz, “A Review of Classification Problems and Algorithms in Renewable Energy Applications,” Energies, vol. 9, no. 8, Art. no. 8, Aug. 2016, doi: 10.3390/en9080607.
[15] Anuradha and G. Gupta, “A self explanatory review of decision tree classifiers,” in International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2014), Jaipur, India, May 2014, pp. 1–7, doi: 10.1109/ICRAIE.2014.6909245.
[16] S. Patil and U. Kulkarni, “Accuracy Prediction for Distributed Decision Tree using Machine Learning approach,” in 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Apr. 2019, pp. 1365–1371, doi: 10.1109/ICOEI.2019.8862580.
[17] N. S. Ahmed and M. H. Sadiq, “Clarify of the random forest algorithm in an educational field,” in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 179–184.
[18] D. Zeebaree, Gene Selection and Classification of Microarray Data Using Convolutional Neural Network. 2018.
[19] O. M. Salih Hassan, A. Mohsin Abdulazeez, and V. M. Tiryaki, “Gait-Based Human Gender Classification Using Lifting 5/3 Wavelet and Principal Component Analysis,” in 2018 International Conference on Advanced Science and Engineering (ICOASE), Duhok, Oct. 2018, pp. 173–178, doi: 10.1109/ICOASE.2018.8548909.
[20] R. Zebari, A. Abdulazeez, D. Zeebaree, D. Zebari, and J. Saeed, “A Comprehensive Review of Dimensionality Reduction Techniques for Feature Selection and Feature Extraction,” Journal of Applied Science and Technology Trends, vol. 1, no. 2, pp. 56–70, 2020.
[21] D. V. Patil and R. S. Bichkar, “A Hybrid Evolutionary Approach To Construct Optimal Decision Trees With Large Data Sets,” in 2006 IEEE International Conference on Industrial Technology, Dec. 2006, pp. 429–433, doi: 10.1109/ICIT.2006.372250.
[22] O. Ahmed and A. Brifcani, “Gene Expression Classification Based on Deep Learning,” in 2019 4th Scientific International Conference Najaf (SICN), Al-Najef, Iraq, Apr. 2019, pp. 145–149, doi: 10.1109/SICN47020.2019.9019357.
[23] M. A. Sulaiman, “Evaluating Data Mining Classification Methods Performance in Internet of Things Applications,” Journal of Soft Computing and Data Mining, vol. 1, no. 2, pp. 11–25, 2020.
[24] F. Yang, “An Extended Idea about Decision Trees,” in 2019 International Conference on Computational Science and Computational Intelligence (CSCI), Dec. 2019, pp. 349–354, doi: 10.1109/CSCI49370.2019.00068.
[25] J. Liang, Z. Qin, S. Xiao, L. Ou, and X. Lin, “Efficient and secure decision tree classification for cloud-assisted online diagnosis services,” IEEE Transactions on Dependable and Secure Computing, 2019.
[26] A. Mohsin Abdulazeez, A. Brifcani, and Issa, “Intrusion Detection and Attack Classifier Based on Three Techniques: A Comparative Study,” Jan. 2021.
[27] A. S. Eesa, Z. Orman, and A. M. A. Brifcani, “A novel feature-selection approach based on the cuttlefish optimization algorithm for intrusion detection systems,” Expert Systems with Applications, vol. 42, no. 5, pp. 2670–2679, Apr. 2015, doi: 10.1016/j.eswa.2014.11.009.
[28] A. Shamim, H. Hussain, and Maqbool Uddin Shaikh, “A framework for generation of rules from decision tree and decision table,” in 2010 International Conference on Information and Emerging Technologies, Jun. 2010, pp. 1–6, doi: 10.1109/ICIET.2010.5625700.
[29] A. Suresh, R. Udendhran, and M. Balamurgan, “Hybridized neural network and decision tree based classifier for prognostic decision making in breast cancers,” Soft Computing, vol. 24, no. 11, pp. 7947–7953, 2020.
[30] Priyanka and D. Kumar, “Decision tree classifier: a detailed survey,” International Journal of Information and Decision Sciences, vol. 12, no. 3, pp. 246–269, 2020.
[31] A. S. Eesa, A. M. Abdulazeez, and Z. Orman, “A DIDS Based on The Combination of Cuttlefish Algorithm and Decision Tree,” Science Journal of University of Zakho, vol. 5, no. 4, pp. 313–318, 2017.
[32] R. Kumar and R. Verma, “Classification algorithms for data mining: A survey,” International Journal of Innovations in Engineering and Technology (IJIET), vol. 1, no. 2, pp. 7–14, 2012.
[33] S. S. Nikam, “A comparative study of classification techniques in data mining algorithms,” Oriental Journal of Computer Science & Technology, vol. 8, no. 1, pp. 13–19, 2015.
[34] C. Z. Janikow, “Fuzzy decision trees: issues and methods,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 28, no. 1, pp. 1–14, 1998.
[35] G. Stein, B. Chen, A. S. Wu, and K. A. Hua, “Decision tree classifier for network intrusion detection with GA-based feature selection,” in Proceedings of the 43rd Annual Southeast Regional Conference, Volume 2, 2005, pp. 136–141.
[36] I. S. Damanik, A. P. Windarto, A. Wanto, S. R. Andani, and W. Saputra, “Decision Tree Optimization in C4.5 Algorithm Using Genetic Algorithm,” in Journal of Physics: Conference Series, 2019, vol. 1255, no. 1, p. 012012.
[37] R. Barros, M. Basgalupp, A. de Carvalho, and A. Freitas, “A Survey of Evolutionary Algorithms for Decision-Tree Induction,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 42, pp. 291–312, Jan. 2012, doi: 10.1109/TSMCC.2011.2157494.
[38] G. Gupta, “A self explanatory review of decision tree classifiers,” in International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2014), 2014, pp. 1–7.
[39] S. S. Gavankar and S. D. Sawarkar, “Eager decision tree,” in 2017 2nd International Conference for Convergence in Technology (I2CT), Mumbai, Apr. 2017, pp. 837–840, doi: 10.1109/I2CT.2017.8226246.
[40] P. H. Swain and H. Hauska, “The decision tree classifier: Design and potential,” IEEE Transactions on Geoscience Electronics, vol. 15, no. 3, pp. 142–147, 1977.
[41] A. Dey, “Machine learning algorithms: a review,” International Journal of Computer Science and Information Technologies, vol. 7, no. 3, pp. 1174–1179, 2016.
[42] J. Mrva, Š. Neupauer, L. Hudec, J. Ševcech, and P. Kapec, “Decision Support in Medical Data Using 3D Decision Tree Visualisation,” in 2019 E-Health and Bioengineering Conference (EHB), Nov. 2019, pp. 1–4, doi: 10.1109/EHB47216.2019.8969926.
[43] Y. Bengio, O. Delalleau, and C. Simard, “Decision trees do not generalize to new variations,” Computational Intelligence, p. 19.
[44] C. E. Brodley and P. E. Utgoff, “Multivariate decision trees,” Machine Learning, vol. 19, no. 1, pp. 45–77, 1995.
[45] G. K. F. Tso and K. K. W. Yau, “Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks,” Energy, vol. 32, no. 9, pp. 1761–1768, Sep. 2007, doi: 10.1016/j.energy.2006.11.010.
[46] S. Singh and P. Gupta, “Comparative study ID3, CART and C4.5 decision tree algorithm: a survey,” International Journal of Advanced Information Science and Technology (IJAIST), vol. 27, no. 27, pp. 97–103, 2014.
[47] L. Rokach and O. Maimon, “Top-Down Induction of Decision Trees Classifiers—A Survey,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 35, pp. 476–487, Dec. 2005, doi: 10.1109/TSMCC.2004.843247.
[48] T.-S. Lim, W.-Y. Loh, and Y.-S. Shih, “A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms,” Machine Learning, vol. 40, no. 3, pp. 203–228, 2000.
[49] W.-Y. Loh, “Fifty Years of Classification and Regression Trees,” International Statistical Review, vol. 82, Jun. 2014, doi: 10.1111/insr.12016.
[50] S. R. Jiao, J. Song, and B. Liu, “A Review of Decision Tree Classification Algorithms for Continuous Variables,” in Journal of Physics: Conference Series, 2020, vol. 1651, no. 1, p. 012083.
[51] Y.-Y. Song and Y. Lu, “Decision tree methods: applications for classification and prediction,” Shanghai Archives of Psychiatry, vol. 27, pp. 130–135, Apr. 2015, doi: 10.11919/j.issn.1002-0829.215044.
[52] Rekha Molala, “Entropy, Information gain and Gini Index; the crux of a Decision Tree,” Medium, Mar. 23, 2020. https://blog.clairvoyantsoft.com/entropy-information-gain-and-gini-index-the-crux-of-a-decision-tree-99d0cdc699f4 (accessed Dec. 28, 2020).
[53] V. Cheushev, D. A. Simovici, V. Shmerko, and S. Yanushkevich, “Functional entropy and decision trees,” in Proceedings. 1998 28th IEEE International Symposium on Multiple-Valued Logic (Cat. No. 98CB36138), 1998, pp. 257–262.
[54] X. Chen, Z. Yang, and W. Lou, “Fault Diagnosis of Rolling Bearing Based on the Permutation Entropy of VMD and Decision Tree,” in 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE), Xiamen, China, Oct. 2019, pp. 1911–1915, doi: 10.1109/EITCE47263.2019.9095187.
[55] C. Shang, M. Li, S. Feng, Q. Jiang, and J. Fan, “Feature selection via maximizing global information gain for text classification,” Knowledge-Based Systems, vol. 54, pp. 298–309, Dec. 2013, doi: 10.1016/j.knosys.2013.09.019.
[56] T. Maszczyk and W. Duch, “Comparison of Shannon, Renyi and Tsallis entropy used in decision trees,” in International Conference on Artificial Intelligence and Soft Computing, 2008, pp. 643–651.
[57] L. E. Raileanu and K. Stoffel, “Theoretical Comparison between the Gini Index and Information Gain Criteria,” Annals of Mathematics and Artificial Intelligence, vol. 41, no. 1, pp. 77–93, May 2004, doi: 10.1023/B:AMAI.0000018580.96245.c6.
[58] Y. Liu, L. Hu, F. Yan, and B. Zhang, “Information Gain with Weight Based Decision Tree for the Employment Forecasting of Undergraduates,” in 2013 IEEE International Conference on Green Computing and Communications and IEEE Internet of Things and IEEE Cyber, Physical and Social Computing, Beijing, China, Aug. 2013, pp. 2210–2213, doi: 10.1109/GreenCom-iThings-CPSCom.2013.417.
[59] R. L. De Mántaras, “A distance-based attribute selection measure for decision tree induction,” Machine Learning, vol. 6, no. 1, pp. 81–92, 1991.
[60] S. Taneja, C. Gupta, K. Goyal, and D. Gureja, “An enhanced k-nearest neighbor algorithm using information gain and clustering,” in 2014 Fourth International Conference on Advanced Computing & Communication Technologies, 2014, pp. 325–329.
[61] Y. Zhao and Y. Zhang, “Comparison of decision tree methods for finding active objects,” Advances in Space Research, vol. 41, no. 12, pp. 1955–1959, 2008.
[62] K. Mittal, D. Khanduja, and P. C. Tewari, “An insight into ‘Decision Tree Analysis’,” World Wide Journal of Multidisciplinary Research and Development, vol. 3, no. 12, pp. 111–115, 2017.
[63] Priyanka and D. Kumar, “Decision tree classifier: a detailed survey,” International Journal of Information and Decision Sciences, vol. 12, no. 3, pp. 246–269, 2020.
[64] Q. Zou, K. Qu, Y. Luo, D. Yin, Y. Ju, and H. Tang, “Predicting diabetes mellitus with machine learning techniques,” Frontiers in Genetics, vol. 9, p. 515, 2018.
[65] T. A. Assegie and P. S. Nair, “Handwritten digits recognition with decision tree classification: a machine learning approach,” International Journal of Electrical and Computer Engineering, vol. 9, no. 5, p. 4446, 2019.
[66] F. De Felice et al., “Decision tree algorithm in locally advanced rectal cancer: an example of over-interpretation and misuse of a machine learning approach,” Journal of Cancer Research and Clinical Oncology, vol. 146, no. 3, pp. 761–765, 2020.
[67] I. H. Sarker, A. Colman, J. Han, A. I. Khan, Y. B. Abushark, and K. Salah, “BehavDT: a behavioral decision tree learning to build user-centric context-aware predictive model,” Mobile Networks and Applications, vol. 25, no. 3, pp. 1151–1161, 2020.
[68] X. Hu, C. Rudin, and M. Seltzer, “Optimal sparse decision trees,” in Advances in Neural Information Processing Systems, 2019, pp. 7267–7275.
[69] S. Patil and U. Kulkarni, “Accuracy Prediction for Distributed Decision Tree using Machine Learning approach,” in 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Apr. 2019, pp. 1365–1371, doi: 10.1109/ICOEI.2019.8862580.
[70] D. Hussain, M. A. Al-Antari, M. A. Al-Masni, S.-M. Han, and T.-S. Kim, “Femur segmentation in DXA imaging using a machine learning decision tree,” Journal of X-ray Science and Technology, vol. 26, no. 5, pp. 727–746, 2018.
[71] N. Linty, A. Farasin, A. Favenza, and F. Dovis, “Detection of GNSS Ionospheric Scintillations Based on Machine Learning Decision Tree,” IEEE Transactions on Aerospace and Electronic Systems, vol. 55, no. 1, pp. 303–317, Feb. 2019, doi: 10.1109/TAES.2018.2850385.
[72] W. Kuang, Y. Chan, S. Tsang, and W. Siu, “Machine Learning-Based Fast Intra Mode Decision for HEVC Screen Content Coding via Decision Trees,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 5, pp. 1481–1496, May 2020, doi: 10.1109/TCSVT.2019.2903547.
[73] I. Ramadhan, P. Sukarno, and M. A. Nugroho, “Comparative Analysis of K-Nearest Neighbor and Decision Tree in Detecting Distributed Denial of Service,” in 2020 8th International Conference on Information and Communication Technology (ICoICT), Yogyakarta, Indonesia, Jun. 2020, pp. 1–4, doi: 10.1109/ICoICT49345.2020.9166380.
[74] V. M. E. Batitis, M. J. G. Caballes, A. A. Ciudad, M. D. Diaz, R. D. Flores, and E. R. E. Tolentin, “Image Classification of Abnormal Red Blood Cells Using Decision Tree Algorithm,” in 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), Mar. 2020, pp. 498–504, doi: 10.1109/ICCMC48092.2020.ICCMC-00093.
[75] Y. Zhang, J. Liu, Z. Zhang, and J. Huang, “Prediction of Daily Smoking Behavior Based on Decision Tree Machine Learning Algorithm,” in 2019 IEEE 9th International Conference on Electronics Information and Emergency Communication (ICEIEC), Jul. 2019, pp. 330–333, doi: 10.1109/ICEIEC.2019.8784698.
[76] S. Nandhini and J. M. K.S, “Performance Evaluation of Machine Learning Algorithms for Email Spam Detection,” in 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), Feb. 2020, pp. 1–4, doi: 10.1109/ic-ETITE47903.2020.312.
[77] A. I. Taloba and S. S. I. Ismail, “An Intelligent Hybrid Technique of Decision Tree and Genetic Algorithm for E-Mail Spam Detection,” in 2019 Ninth International Conference on Intelligent Computing and Information Systems (ICICIS), Dec. 2019, pp. 99–104, doi: 10.1109/ICICIS46948.2019.9014756.
[78] M. O. Arowolo, M. Adebiyi, A. Adebiyi, and O. Okesola, “PCA Model For RNA-Seq Malaria Vector Data Classification Using KNN And Decision Tree Algorithm,” in 2020 International Conference in Mathematics, Computer Engineering and Computer Science (ICMCECS), Mar. 2020, pp. 1–8, doi: 10.1109/ICMCECS47690.2020.240881.
[79] S. Pathan, P. Kumar, R. Pai, and S. V. Bhandary, “Automated detection of optic disc contours in fundus images using decision tree classifier,” Biocybernetics and Biomedical Engineering, vol. 40, no. 1, pp. 52–64, 2020.
[80] A. A. Nagra et al., “Hybrid self-inertia weight adaptive particle swarm optimisation with local search using C4.5 decision tree classifier for feature selection problems,” Connection Science, vol. 32, no. 1, pp. 16–36, 2020.
[81] A. Ahmim, L. Maglaras, M. A. Ferrag, M. Derdour, and H. Janicke, “A novel hierarchical intrusion detection system based on decision tree and rules-based models,” in 2019 15th International Conference on Distributed Computing in Sensor Systems (DCOSS), 2019, pp. 228–233.
[82] M. Li, H. Xu, and Y. Deng, “Evidential decision tree based on belief entropy,” Entropy, vol. 21, no. 9, p. 897, 2019.
[83] P. Sathiyanarayanan, S. Pavithra, M. S. Saranya, and M. Makeswari, “Identification of Breast Cancer Using The Decision Tree Algorithm,” in 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), 2019, pp. 1–6.