Classification Based on Decision Tree Algorithm
1 IT Department, Technical College of Informatics Akre, Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, [email protected]
2 Presidency of Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, [email protected]
*Correspondence: [email protected]
Abstract
Decision tree classifiers are regarded as one of the most well-known methods of data classification. Researchers from various fields and backgrounds, such as machine learning, pattern recognition, and statistics, have considered the problem of growing a decision tree from available data. The use of decision tree classifiers has been proposed in many fields, such as medical disease analysis, text classification, user smartphone classification, images, and many more. This paper provides a detailed overview of decision trees. Furthermore, paper specifics, such as the algorithms/approaches used, the datasets, and the outcomes achieved, are evaluated and outlined comprehensively. In addition, all of the approaches analyzed are discussed to illustrate the themes of the authors and to identify the most accurate classifiers. As a result, the uses of different types of datasets are discussed and their findings are analyzed.
doi: 10.38094/jastt20165
training process [15 - 17]. The amount of data obtained in data mining environments is huge [18 - 20]. If the data set is properly classified and contains the minimum number of nodes, then using the decision tree method is optimal [21 - 23].

A decision tree is a tree-based technique in which any path beginning from the root is described by a data-separating sequence until a Boolean outcome is reached at the leaf node [24 - 27]. It is a hierarchical representation of knowledge relationships that contains nodes and connections; when the tree is used to classify, the nodes represent tests on attributes and the connections represent the outcomes of those tests [28 - 31].
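To make the root-to-leaf path concrete, here is a minimal Python sketch of such a tree; the node layout, feature names, and thresholds are illustrative assumptions of ours, not taken from any cited work:

```python
# Minimal decision tree walk: each internal node tests one attribute
# against a threshold; each leaf carries a Boolean class label.

class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, label=None):
        self.feature = feature      # attribute tested at this node (None for leaves)
        self.threshold = threshold  # split value for the test
        self.left = left            # subtree taken when sample[feature] <= threshold
        self.right = right          # subtree taken when sample[feature] > threshold
        self.label = label          # Boolean outcome stored at a leaf

def classify(node, sample):
    """Follow one path from the root until a leaf is reached."""
    while node.label is None:
        node = node.left if sample[node.feature] <= node.threshold else node.right
    return node.label

# A hand-built toy tree: the root tests "age", its right child tests "income".
tree = Node(feature="age", threshold=30,
            left=Node(label=False),
            right=Node(feature="income", threshold=50_000,
                       left=Node(label=False),
                       right=Node(label=True)))

print(classify(tree, {"age": 42, "income": 72_000}))  # True
```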
In this paper, a comprehensive review is performed of the latest and most efficient approaches to decision trees taken by researchers in the past three years across different areas of machine learning. Also, the details of these works, such as the algorithms/approaches used, the datasets, and the findings achieved, are summarized. In addition, this study highlights the most commonly used approaches and the methods that achieved the highest accuracy.
The organization of the remaining paper is as follows: Section II presents the decision tree algorithm, mentioning its types, benefits, and drawbacks; Section III gives a literature review on the decision tree algorithm; Section IV provides a comparison and discussion of the decision tree; and the last section contains the conclusion.
II. DECISION TREE ALGORITHM

One of the widely used techniques in data mining is systems that create classifiers [32]. In data mining, classification algorithms are capable of handling a vast volume of information. They can be used to make assumptions regarding categorical class names, to classify knowledge on the basis of training sets and class labels, and to classify newly obtained data [33]. Classification in machine learning comprises several algorithms, and this paper focuses on the decision tree algorithm in general. Fig. 1 illustrates the structure of a DT.

A threshold value is applied in each test [36]. The conceptual rules are much easier to construct than the numerical weights of the connections between nodes in a neural network [37, 38]. DT is used mainly for grouping purposes. Moreover, DT is a commonly utilized classification model in data mining [39]. Each tree is composed of nodes and branches; each node represents a feature in a category to be classified, and each subset defines a value that can be taken by the node [40, 41]. Because of their simple analysis and their precision on multiple data forms, decision trees have found many implementation fields [42]. Fig. 2 shows an example of a DT.

Fig. 2. Example of a Decision Tree [43]

A. Types of Decision Tree Algorithms
There are several types of DT algorithms, such as: Iterative Dichotomiser 3 (ID3), the successor of ID3 (C4.5), Classification And Regression Tree (CART) [44], CHi-squared Automatic Interaction Detector (CHAID) [45], Multivariate Adaptive Regression Splines (MARS) [46], Generalized, Unbiased, Interaction Detection and Estimation (GUIDE), Conditional Inference Trees (CTREE) [47], [48], Classification Rule with Unbiased Interaction Selection and Estimation (CRUISE), and Quick, Unbiased and Efficient Statistical Tree (QUEST) [49], [50]. Table I shows a comparison between the frequently used decision tree algorithms [51].

TABLE I: COMPARISON BETWEEN THE FREQUENTLY USED DECISION TREE ALGORITHMS [51] (of the original table, only the input-variables row survives extraction: each compared algorithm accepts categorical/continuous input variables)
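As a concrete reference point for these variants, the short sketch below trains a CART-style tree with scikit-learn, whose DecisionTreeClassifier implements an optimized version of CART; the dataset and hyperparameters are our own illustrative choices, not those of any study reviewed here:

```python
# Train a CART-style decision tree with scikit-learn and inspect its rules.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# criterion="entropy" mimics ID3/C4.5-style information-gain splits;
# "gini" is the CART default.
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=42)
clf.fit(X_train, y_train)

print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
print(export_text(clf))  # the learned root-to-leaf decision rules
```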
B. Entropy and Information Gain
Entropy is employed to measure a dataset's impurity or randomness [52], [53]. For a two-class problem, the value of entropy always lies between 0 and 1. Its value is best when it is equal to 0 and worst when it is equal to 1, i.e. the closer its value is to 0, the better, as shown in Fig. 3. If the target classification takes c different values, the entropy of a set S with respect to this c-wise classification is given by equation (1) [54], [55]:

Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i        (1)

where p_i is the proportion of examples in S belonging to class i.
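As a from-scratch illustration of equation (1) and the information gain it induces, consider the following Python sketch; the toy label counts are our own, chosen to mirror the classic 9-positive/5-negative example:

```python
# Entropy of a label set and information gain of a candidate split,
# computed directly from equation (1).
from collections import Counter
import math

def entropy(labels):
    """Entropy(S) = sum over classes of -p_i * log2(p_i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, subsets):
    """Gain = Entropy(parent) - weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent) - weighted

labels = ["yes"] * 9 + ["no"] * 5      # 9 positive, 5 negative examples
print(round(entropy(labels), 3))       # 0.94
left, right = labels[:7], labels[7:]   # one candidate partition of the set
print(round(information_gain(labels, [left, right]), 3))  # 0.509
```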
III. LITERATURE REVIEW

Assegie and Nair [65] suggested that a decision tree classifier achieving 83.4% accuracy had an impact on handwritten number recognition.

De Felice et al. [66] suggested a decision tree algorithm to recognize known and novel clinical indications before treatment for survival in Locally Advanced Rectal Cancer (LARC). The analysis showed that even non-experts in the field can easily interpret the tree-based machine learning process, in particular classification trees, although validation errors need to be managed for the trees to achieve their statistical capacity. Patients with histologically confirmed LARC between 2007 and 2014 had their data checked, and the Kaplan-Meier approach was used to determine overall survival (OS). The study involved a total of 100 patients; the 5-year and 7-year OS rates were 76.4% and 71.3%. Age, comorbidity, tumor size, clinical tumor classification (cT), and clinical node classification (cN) are important predictive variables for tree composition. The results showed that the highest survival rates were in elderly patients with a tumor size of less than 5 cm and in patients under the age of 65 years who had cT3 disease. A decision tree is a way of achieving better clinical decision-making based on broad data sets.
TABLE III: SUMMARY OF LITERATURE REVIEW RELATED TO THE DT ALGORITHM

Ref. | Year | Dataset | Technique(s) | Accuracy
Nandhini and K.S [76] | 2020 | UCI | DT, KNN, LR, SVM and NB | DT: 99.93%, KNN: 99.93%, LR: 93.13%, SVM: 90.76%, NB: 79.52%
Nagra et al. [80] | 2020 | UCI | SIW-APSO-LS | SIW-APSO-LS: 99.88%
Kuang et al. [72] | 2020 | SCBs | sSCC | decreasing computational complexity by 47.62% on average
Pathan et al. [79] | 2020 | fundus images | Optic Disc (OD) segmentation | OD: 99.61%
Batitis et al. [74] | 2020 | red blood cell images | DT | DT: 89.31%
Ramadhan et al. [73] | 2020 | CICIDS2017 | DT and KNN | DT: 99.91%, KNN: 98.94%
Arowolo et al. [78] | 2020 | RNA-seq Malaria | KNN and DT | KNN: 86.7%, DT: 83.3%
De Felice et al. [66] | 2020 | patients with histologically proven LARC between 2007 and 2014 | Kaplan-Meier method | 5-year OS: 76.4%, 7-year OS: 71.3%
Zhang et al. [75] | 2019 | smokers, Chinese Center for Disease Control and Prevention | DT (XGBoost) and RF | DT: 84.11%, RF: 58.11%
Sathiyanarayanan et al. [83] | 2019 | Wisconsin Breast Cancer dataset | DT and KNN | DT: 99%, KNN: 97%
Hu et al. [68] | 2019 | UCI and COMPAS | OSDT and BinOCT | COMPAS/OSDT: 66.90%; UCI/OSDT: 82.881%, BinOCT: 76.722%
Patil and Kulkarni [69] | 2019 | UCI | DST, PT and MLT | DST: best 99.9%, worst 81.445%
Sarker et al. [67] presented a behavioral decision tree, named "BehavDT", a context-aware structure that takes into account consumer behavior-oriented generalization according to the degree of personal choice. In exceptional cases of association, the BehavDT model provided comprehensive decisions as well as context-specific decisions. Experiments were carried out on real smartphone datasets of individual users to test the efficiency of the BehavDT model. The results indicated that the BehavDT context-aware model, whose accuracy is up to 90%, is the most effective model compared to other conventional machine learning models.

Hu et al. [68] illustrated the first practical algorithm for optimizing decision trees over binary variables. The algorithm is a co-design of analytical bounds, a dedicated bit-vector library, and data structures that minimize the search space, together with current implementation technologies. They used the Binary Optimal Classification Trees (BinOCT) method, the currently publicly available method, to assess accuracy and compare it with Optimal Sparse Decision Trees (OSDT). They utilized text datasets from the University of California, Irvine (UCI) Machine Learning Repository and numeric datasets from the ProPublica COMPAS data. The findings showed that on a COMPAS dataset, the optimal decision tree produced by OSDT reached an accuracy of 66.90%. Besides, when BinOCT and OSDT generated decision trees on the UCI dataset, their accuracy was 76.722% and 82.881%, respectively.

Patil and Kulkarni [69] introduced a Distributed Spark Tree (DST) to better execute the DT algorithm in terms of model construction time without losing accuracy, and suggested using it in Spark's environment. Data in Spark's shared architecture does not perform horizontal parallel execution; Spark functions well and coherently with in-memory computations, RDDs, and map reduction. The dataset used was from the UCI ML repository, and four classes were chosen. Wide data files were utilized to test performance regarding model build time for DST, PySpark (PT), and MLLib (MLT). The findings showed that in terms of accuracy, DST performed better than both PT and MLT, as its lowest value was 81.445% and, depending on the scale of the dataset, its highest was 99.9%.
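For context on the Spark side of this comparison, a minimal PySpark sketch of the MLlib decision tree baseline is shown below; it is a generic illustration of Spark's public API under our own toy data, not the authors' DST implementation:

```python
# Training a decision tree on Spark via the MLlib (pyspark.ml) API,
# the kind of baseline the DST study compares against.
from pyspark.ml.classification import DecisionTreeClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dt-demo").getOrCreate()

# Tiny in-memory stand-in for a UCI-style table.
df = spark.createDataFrame(
    [(5.1, 3.5, 0.0), (6.2, 2.9, 1.0), (4.9, 3.0, 0.0), (6.7, 3.1, 1.0)],
    ["f1", "f2", "label"])

# Assemble the raw columns into the single feature vector MLlib expects.
assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
train = assembler.transform(df)

model = DecisionTreeClassifier(labelCol="label", featuresCol="features").fit(train)
preds = model.transform(train)
acc = MulticlassClassificationEvaluator(
    labelCol="label", metricName="accuracy").evaluate(preds)
print(f"training accuracy: {acc:.2f}")
spark.stop()
```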
Hussain et al. [70] offered a modern approach, namely the Pixel Label Decision Tree (PLDT), and checked whether it can achieve better detailed femur segmentation efficiency in DXA imaging. PLDT includes feature extraction and selection, and was used to uncover hidden patterns found in DXA pictures, in contrast to photographic images. To decide the best feature set for the model, PLDT generates seven new feature sets; Global Threshold (GT), Region Growing Threshold (RGT), and Artificial Neural Networks (ANN) were used for comparison. The results revealed that in segmenting DXA images, PLDT exceeds the other conventional partition techniques: the accuracy of PLDT is 91.4%, while GT is 68.4%, RGT is 76%, and ANN is 84.4%.

Linty et al. [71] proposed a new approach based on the amplitude of signals from the Global Navigation Satellite System (GNSS), used to detect ionospheric scintillation events with attention to accuracy, reliability, and readiness. A broad collection of 50 Hz post-correlation data was supplied by the GNSS receiver. The outcomes showed that this method, in terms of accuracy and F-score, exceeds state-of-the-art techniques and can achieve a human-driven standard, which is the level of manual annotation: it reaches 98% identification, very similar to human-driven classification.

Kuang et al. [72] proposed a decision tree-based structure for fast intra-mode decision in Screen Content Coding (SCC), trained by testing different features on the training sets. Moreover, to prevent an exhaustive search process, a sequential arrangement of decision trees was illustrated. In addition, SCBs were used as datasets to balance the SCC with the Intra Block Copy (IBC) and PaLeTte (PLT) modes. The results indicated that the SCC system offers a 47.62% decrease in computational complexity on average, with a small 1.42% increase in Bjøntegaard delta bitrate (BDBR).

Ramadhan et al. [73] demonstrated a comparative analysis of accuracy and processing time for the K-Nearest Neighbor (KNN) and Decision Tree (DT) algorithms in detecting DDoS attacks. Moreover, they used the CICIDS2017 dataset, which consists of the latest attacks and global packages, is standard, and is applicable to real-world data in PCAP format. The findings showed that the accuracy of DT in detecting DDoS attacks was higher than that of KNN: the accuracy of DT was 99.91%, and the accuracy of KNN was 98.94%.

Batitis et al. [74] presented a system to identify up to 10 types of irregular red blood cells and to determine the accuracy rate for all abnormal red blood cells. To detect irregular red blood cells, they employed a DT algorithm in image processing and used images of former patients from hospitals; a camera was used to capture the slides and feed them into the software. The results showed that the accuracy rate averaged 89.31% and the error rate averaged 10.69%. Furthermore, the central irregularity of the codocyte pallor was found to be a cause of mistakes in the classification of abnormal red blood cells.

Zhang et al. [75] proposed a model based on the decision tree machine learning algorithm named Extreme Gradient Boosting (XGBoost) for the prediction of regular smoking time. Furthermore, to create a simulated data set for smoking time data, the Chinese Center for Disease Control and Prevention collected people's information from smokers. They also used a module for extracting feature information; to evaluate its output, they compared the decision tree (XGBoost) module with the Random Forest machine learning algorithm. The results showed that DT efficiency is higher than RF, with DT achieving 84.11% accuracy versus 58.11% for RF.

Nandhini and K.S [76] discussed effective methods of developing a machine learning model using some of the common algorithms that can distinguish whether mail is spam or ham. The Spambase dataset from the UCI Machine Learning repository was used. Besides, they evaluated the output of Logistic Regression (LR), DT, Naïve Bayes (NB), KNN, and Support Vector Machine (SVM) to construct an efficient machine learning model for spam, using the Weka tool to train and evaluate the data collection. The results indicated that DT performance is comparable to or better than KNN performance; the accuracies are as follows: DT is 99.93%, KNN is 99.93%, LR is 93.13%, SVM is 90.76%, and NB is 79.52%.

Taloba and Ismail [77] developed a new machine learning approach hybridizing a decision tree and a genetic algorithm, known as GADT, for spam detection. The genetic algorithm is the most significant algorithm for enhancing decision tree efficiency, and it is efficient and reliable for text classification; it optimizes the confidence factor that governs decision tree pruning to find its optimum value. They used the UCI Machine Learning repository spam dataset. Besides, they used Principal Component Analysis (PCA) to delete features that are inappropriate for email message content. The findings showed that the hybrid GADT approach has an accuracy of 93.4% before using PCA and an accuracy of 95.5% after using PCA, which implies that removing inappropriate features with PCA has a great impact.

Arowolo et al. [78] implemented a Principal Component Analysis (PCA) feature extraction algorithm to reduce dimensionality and analyze high-dimensional gene expression evidence. The KNN and DT classification algorithms were utilized to detect various biological structures, to offer better-value resolution, and to detect new malaria genes and prediction tests. Ribonucleic acid sequencing (RNA-seq) data was used as the data collection. The results indicated that the performance of the KNN classifier is better than the DT classifier on the PCA-extracted features: the accuracy of KNN reaches 86.7% while DT reaches 83.3%.

Pathan et al. [79] proposed a new technique that recognizes and removes the blood vessels for correct segmentation of the Optic Disc (OD). This is done in two steps. First, a directional filter is used to build an efficient blood vessel identification and exclusion algorithm. In the second step, to detect the contour of the optic disc, the decision tree classifier is utilized to achieve an adaptive threshold. Two separate databases were used: 300 fundus images obtained from Kasturba Medical College (KMC) Manipal, and the publicly accessible RIM-ONE database. The results showed that a fully automatic OD segmentation technique that uses a decision tree classifier to achieve the segmentation threshold improves the robustness of the algorithm, even for images containing exudates, vesicle atrophy, and reversals, hence resulting in an appropriate fractionation of the OD. The mathematical study demonstrates the effect of preprocessing: the average accuracy obtained for KMC images is 99.61%, and for the RIM-ONE database the obtained average accuracy is 99.15%.
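Several of the studies above ([77], [78]) pair PCA feature reduction with DT or KNN classifiers. The following sketch shows one plausible wiring of such a pipeline in scikit-learn, with synthetic data standing in for the gene-expression and spam datasets; it is our own generic reconstruction, not the exact setup of either study:

```python
# PCA feature reduction followed by DT and KNN classification,
# in the spirit of the pipelines reviewed above.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a high-dimensional gene-expression matrix.
X, y = make_classification(n_samples=300, n_features=500,
                           n_informative=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

for name, clf in [("DT", DecisionTreeClassifier(random_state=0)),
                  ("KNN", KNeighborsClassifier(n_neighbors=5))]:
    model = make_pipeline(PCA(n_components=20), clf)  # reduce, then classify
    model.fit(X_train, y_train)
    print(name, round(model.score(X_test, y_test), 3))
```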
Nagra et al. [80] introduced the Self-Inertia Weight Adaptive Particle Swarm Optimization with Gradient-Based Local Search (SIW-APSO-LS) approach, modified to conduct feature selection, with the C4.5 decision tree method used as a classifier to evaluate the given feature subsets. To compare algorithms on feature selection problems, 16 datasets from the UCI Machine Learning Repository were used in the experiments. The experimental outcomes demonstrate that SIW-APSO-LS simplifies the collection of features by effectively decreasing the number of features picked while maintaining the best precision compared to other feature selection approaches in the literature on the same test functions. In the field of attribute selection, the experimental findings showed that the proposed approach is useful, and the highest accuracy obtained across the 16 datasets is 99.88%.

Ahmim et al. [81] proposed a new Intrusion Detection System (IDS) that incorporates diverse classification systems based on DT and rule-based concepts, namely the REP tree, the JRip algorithm, and Forest PA. Specifically, the first and second approaches take dataset features as inputs and categorize the network traffic as Attack/Benign, while the third classifier uses the attributes of the original data collection together with the outputs of the first and second classifiers. The research findings, achieved by using the CICIDS2017 dataset to analyze the IDS, testify to its dominance in terms of accuracy, detection rate, false alarm rate, and time overhead relative to current state-of-the-art schemes. In detail, their model has the highest detection rate (DR) at 94.457%, the highest precision at 96.665%, and the lowest false alarm rate (FAR) at 1.145%, and its low computing time allows it to be quickly implemented in a soft real-time system.

Li et al. [82] provided an evidential decision tree to classify fuzzy datasets, with belief (Deng) entropy used as the indicator of the partition rules for its construction. Moreover, the Basic Belief Assignments (BBAs) of the Iris and Wine datasets are utilized to calculate the optimal splitting feature: the lower the Deng entropy, the more effective the feature is at characterizing the samples. In contrast to the standard combination rules employed for combining BBAs, the evidential DT can be extended specifically to classification. The findings showed that the implementation of the evidential DT based on belief entropy effectively decreases the complexity of fuzzy data classification, such as deciding whether a patient is affected by a malignant or benign cancer type. The Wisconsin Breast Cancer dataset, containing 32 attributes and 569 records, was used, with a 10-fold cross-validation test to identify and analyze the algorithms. The accuracy is 95% when using the Wine dataset, while the accuracy obtained on the Iris dataset is 98%.

Sathiyanarayanan et al. [83] used the DT algorithm under the supervised learning mechanism to detect breast cancer. Breast cancer identification is conducted on data that is separated into preparation and testing sets, and the result obtained is contrasted between the KNN and DT algorithms. The findings reveal that the accuracy obtained by KNN is 97%, while DT reaches a maximum accuracy of 99%. Therefore, a decision tree algorithm under supervised learning methods can predict the type of cancer.

IV. COMPARISON AND DISCUSSION

Decision tree classification algorithms consist of several types that are used to generate a DT, handling both continuous and categorical attributes as well as missing values. The generated DT is typically represented as a statistical classifier and can also be used for clustering. Nodes and branches make up the DT: each node poses a test based on one or more properties, e.g. comparing an attribute value with a constant, or using other functions to compare more than one property. For the purposes of the decision tree, the learned data collection is sometimes referred to as the outcome tree. To perform classification in machine learning and data mining using the DT algorithm, the algorithm is applied iteratively, and classification requires a three-stage process: model construction (learning), model evaluation (accuracy), and model use (classification). The DT classification stage is based on the percentage of acquired information, which is measured by entropy. This metric is used to describe the test characteristic for a node in the tree and is referred to as the attribute selection measure: the property with the best information gain is chosen as the test function for the current node. Some studies proposed approaches to overcome the shortcomings of DT problems so that optimal trees can be computed; based on the review performed earlier, DT methods have shown that such problems as described above can be avoided and, furthermore, will provide an appropriate solution for the specified dataset. According to Table III, it was observed that many studies were conducted with different datasets, and the DT approach was used to resolve its weaknesses and to obtain better performance. Several optimization techniques were used in the study [76] to strengthen the decision tree on the stored UCI ML datasets; based on the assessment findings, it was shown that the DT approach achieved the highest accuracy, 99.93%, compared to other techniques such as KNN, LR, SVM, and NB, which perform less well than the DT approach. In the segmentation task, the study [79] used the DT approach to identify and extract the blood vessels for proper Optic Disc (OD) segmentation, which resulted in results as high as 99.61%. Moreover, based on the study [69], it has been shown that the DST method can also improve the DT, where PT and MLT were used alongside DST for DT on the UCI datasets; it was shown that DST is more capable of enhancing DT than the other techniques. Ultimately, using UCI Machine Learning repository datasets and the CICIDS2017 dataset, which consists of the latest attacks among all the other datasets, DT proved to be the highest and best performing in accuracy. The study [75] utilized the DT (XGBoost) and RF on the datasets of smokers from the Chinese Center for Disease Control and Prevention, and it was found that, again, the DT approach achieved the highest accuracy, which is 84.11%. Furthermore, based on studies [73], [78], [83] using DT and KNN on the CICIDS2017, RNA-seq Malaria, and Wisconsin Breast Cancer datasets, it was found that the DT approach had the highest accuracy in all three studies; its accuracy was highest when using the CICIDS2017 dataset, where it achieved 99.91%.
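As a schematic of the three-stage process just described (construct, evaluate, use) and of the DT-versus-KNN comparisons that recur in Table III, the following hedged sketch runs both classifiers under 10-fold cross-validation on scikit-learn's built-in breast cancer data; it mirrors the methodology generically and will not reproduce any study's exact figures:

```python
# Three-stage use of classifiers: construct (fit), evaluate (cross-validate),
# and use (predict), comparing DT against KNN as in several surveyed studies.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "DT": DecisionTreeClassifier(random_state=1),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

for name, model in models.items():
    # Stages 1 and 2: construct the model and evaluate it with 10-fold CV.
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.3f}")

# Stage 3: use the chosen model to classify a new sample.
best = models["DT"].fit(X, y)
print("prediction for first sample:", best.predict(X[:1])[0])
```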
V. CONCLUSION

Decision tree classifiers are known for their enhanced view of performance outcomes. Because of their strong precision, optimized splitting parameters, and enhanced tree pruning techniques, algorithms such as ID3, C4.5, CART, CHAID, and QUEST are commonly used among recognized data classifiers. Separate datasets are used to draw training samples from a huge data set, which in turn affects the precision on the test set. Decision trees have several possible concerns regarding robustness, scalability, and optimization of tree height. But, in contrast to other methods of data classification, decision trees create an efficient rule collection that is simple to understand. This paper reviews the most recent research conducted in many areas, such as the analysis of medical diseases, classification of texts, classification of user smartphones and images, etc. Furthermore, the techniques/algorithms used, the datasets used by the authors, and the achieved outcomes related to accuracy are summarized for decision trees. Finally, the best accuracy achieved for the decision tree algorithm is 99.93%, obtained when using a machine learning repository dataset.

REFERENCES

[1] D. Abdulqader, A. Mohsin Abdulazeez, and D. Zeebaree, “Machine Learning Supervised Algorithms of Gene Selection: A Review,” Apr. 2020.
[2] M. W. Libbrecht and W. S. Noble, “Machine learning applications in genetics and genomics,” Nature Reviews Genetics, vol. 16, no. 6, pp. 321–332, 2015.
[3] J. Wang, P. Neskovic, and L. N. Cooper, “Training Data Selection for Support Vector Machines,” in Advances in Natural Computation, vol. 3610, L. Wang, K. Chen, and Y. S. Ong, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005, pp. 554–564.
[4] D. Maulud and A. M. Abdulazeez, “A Review on Linear Regression Comprehensive in Machine Learning,” Journal of Applied Science and Technology Trends, vol. 1, no. 4, pp. 140–147, 2020.
[5] G. Carleo et al., “Machine learning and the physical sciences,” Reviews of Modern Physics, vol. 91, no. 4, p. 045002, 2019.
[6] T. Hillel, M. Bierlaire, M. Elshafie, and Y. Jin, “A systematic review of machine learning classification methodologies for modelling passenger mode choice,” Journal of Choice Modelling, p. 100221, 2020.
[7] D. Zeebaree, H. Haron, A. Mohsin Abdulazeez, and D. Zebari, Machine Learning and Region Growing for Breast Cancer Segmentation. 2019, p. 93.
[8] C. Feng, S. Wu, and N. Liu, “A user-centric machine learning framework for cyber security operations center,” in 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), Beijing, China, Jul. 2017, pp. 173–175, doi: 10.1109/ISI.2017.8004902.
[9] S. B. Kotsiantis, I. Zaharakis, and P. Pintelas, “Supervised machine learning: A review of classification techniques,” Emerging Artificial Intelligence Applications in Computer Engineering, vol. 160, no. 1, pp. 3–24, 2007.
[10] S. B. Kotsiantis, I. D. Zaharakis, and P. E. Pintelas, “Machine learning: a review of classification and combining techniques,” Artif. Intell. Rev., vol. 26, no. 3, pp. 159–190, Nov. 2006, doi: 10.1007/s10462-007-9052-3.
[11] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: a review,” ACM Computing Surveys, 1999.
[12] D. Sharma and N. Kumar, “A Review on Machine Learning Algorithms, Tasks and Applications,” vol. 6, pp. 2278–1323, Oct. 2017.
[13] K. Pahwa and N. Agarwal, “Stock Market Analysis using Supervised Machine Learning,” in 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, Feb. 2019, pp. 197–200, doi: 10.1109/COMITCon.2019.8862225.
[14] M. Pérez-Ortiz, S. Jiménez-Fernández, P. A. Gutiérrez, E. Alexandre, C. Hervás-Martínez, and S. Salcedo-Sanz, “A Review of Classification Problems and Algorithms in Renewable Energy Applications,” Energies, vol. 9, no. 8, Art. no. 8, Aug. 2016, doi: 10.3390/en9080607.
[15] Anuradha and G. Gupta, “A self explanatory review of decision tree classifiers,” in International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2014), Jaipur, India, May 2014, pp. 1–7, doi: 10.1109/ICRAIE.2014.6909245.
[16] S. Patil and U. Kulkarni, “Accuracy Prediction for Distributed Decision Tree using Machine Learning approach,” in 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Apr. 2019, pp. 1365–1371, doi: 10.1109/ICOEI.2019.8862580.
[17] N. S. Ahmed and M. H. Sadiq, “Clarify of the random forest algorithm in an educational field,” in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 179–184.
[18] D. Zeebaree, Gene Selection and Classification of Microarray Data Using Convolutional Neural Network. 2018.
[19] O. M. Salih Hassan, A. Mohsin Abdulazeez, and V. M. Tiryaki, “Gait-Based Human Gender Classification Using Lifting 5/3 Wavelet and Principal Component Analysis,” in 2018 International Conference on Advanced Science and Engineering (ICOASE), Duhok, Oct. 2018, pp. 173–178, doi: 10.1109/ICOASE.2018.8548909.
[20] R. Zebari, A. Abdulazeez, D. Zeebaree, D. Zebari, and J. Saeed, “A Comprehensive Review of Dimensionality Reduction Techniques for Feature Selection and Feature Extraction,” Journal of Applied Science and Technology Trends, vol. 1, no. 2, pp. 56–70, 2020.
[21] D. V. Patil and R. S. Bichkar, “A Hybrid Evolutionary Approach To Construct Optimal Decision Trees With Large Data Sets,” in 2006 IEEE International Conference on Industrial Technology, Dec. 2006, pp. 429–433, doi: 10.1109/ICIT.2006.372250.
[22] O. Ahmed and A. Brifcani, “Gene Expression Classification Based on Deep Learning,” in 2019 4th Scientific International Conference Najaf (SICN), Al-Najef, Iraq, Apr. 2019, pp. 145–149, doi: 10.1109/SICN47020.2019.9019357.
[23] M. A. Sulaiman, “Evaluating Data Mining Classification Methods Performance in Internet of Things Applications,” Journal of Soft Computing and Data Mining, vol. 1, no. 2, pp. 11–25, 2020.
[24] F. Yang, “An Extended Idea about Decision Trees,” in 2019 International Conference on Computational Science and Computational Intelligence (CSCI), Dec. 2019, pp. 349–354, doi: 10.1109/CSCI49370.2019.00068.
[25] J. Liang, Z. Qin, S. Xiao, L. Ou, and X. Lin, “Efficient and secure decision tree classification for cloud-assisted online diagnosis services,” IEEE Transactions on Dependable and Secure Computing, 2019.
[26] A. Mohsin Abdulazeez, A. Brifcani, and Issa, “Intrusion Detection and Attack Classifier Based on Three Techniques: A Comparative Study,” Jan. 2021.
[27] A. S. Eesa, Z. Orman, and A. M. A. Brifcani, “A novel feature-selection approach based on the cuttlefish optimization algorithm for intrusion detection systems,” Expert Systems with Applications, vol. 42, no. 5, pp. 2670–2679, Apr. 2015, doi: 10.1016/j.eswa.2014.11.009.
[28] A. Shamim, H. Hussain, and Maqbool Uddin Shaikh, “A framework for generation of rules from decision tree and decision table,” in 2010 International Conference on Information and Emerging Technologies, Jun. 2010, pp. 1–6, doi: 10.1109/ICIET.2010.5625700.
[29] A. Suresh, R. Udendhran, and M. Balamurgan, “Hybridized neural network and decision tree based classifier for prognostic decision making in breast cancers,” Soft Computing, vol. 24, no. 11, pp. 7947–7953, 2020.
[30] Priyanka and D. Kumar, “Decision tree classifier: a detailed survey,” International Journal of Information and Decision Sciences, vol. 12, no. 3, pp. 246–269, 2020.
[31] A. S. Eesa, A. M. Abdulazeez, and Z. Orman, “A DIDS Based on The Combination of Cuttlefish Algorithm and Decision Tree,” Science Journal of University of Zakho, vol. 5, no. 4, pp. 313–318, 2017.
[32] R. Kumar and R. Verma, “Classification algorithms for data mining: A survey,” International Journal of Innovations in Engineering and Technology (IJIET), vol. 1, no. 2, pp. 7–14, 2012.
[33] S. S. Nikam, “A comparative study of classification techniques in data mining algorithms,” Oriental Journal of Computer Science & Technology, vol. 8, no. 1, pp. 13–19, 2015.
[34] C. Z. Janikow, “Fuzzy decision trees: issues and methods,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 28, no. 1, pp. 1–14, 1998.
[35] G. Stein, B. Chen, A. S. Wu, and K. A. Hua, “Decision tree classifier for network intrusion detection with GA-based feature selection,” in Proceedings of the 43rd Annual Southeast Regional Conference, Volume 2, 2005, pp. 136–141.
[36] I. S. Damanik, A. P. Windarto, A. Wanto, S. R. Andani, and W. Saputra, “Decision Tree Optimization in C4.5 Algorithm Using Genetic Algorithm,” in Journal of Physics: Conference Series, 2019, vol. 1255, no. 1, p. 012012.
[37] R. Barros, M. Basgalupp, A. de Carvalho, and A. Freitas, “A Survey of Evolutionary Algorithms for Decision-Tree Induction,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 42, pp. 291–312, Jan. 2012, doi: 10.1109/TSMCC.2011.2157494.
[38] G. Gupta, “A self explanatory review of decision tree classifiers,” in International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2014), 2014, pp. 1–7.
[39] S. S. Gavankar and S. D. Sawarkar, “Eager decision tree,” in 2017 2nd International Conference for Convergence in Technology (I2CT), Mumbai, Apr. 2017, pp. 837–840, doi: 10.1109/I2CT.2017.8226246.
[40] P. H. Swain and H. Hauska, “The decision tree classifier: Design and potential,” IEEE Transactions on Geoscience Electronics, vol. 15, no. 3, pp. 142–147, 1977.
[41] A. Dey, “Machine learning algorithms: a review,” International Journal of Computer Science and Information Technologies, vol. 7, no. 3, pp. 1174–1179, 2016.
[42] J. Mrva, Š. Neupauer, L. Hudec, J. Ševcech, and P. Kapec, “Decision Support in Medical Data Using 3D Decision Tree Visualisation,” in 2019 E-Health and Bioengineering Conference (EHB), Nov. 2019, pp. 1–4, doi: 10.1109/EHB47216.2019.8969926.
[43] Y. Bengio, O. Delalleau, and C. Simard, “Decision trees do not generalize to new variations,” Computational Intelligence, p. 19.
[44] C. E. Brodley and P. E. Utgoff, “Multivariate decision trees,” Machine Learning, vol. 19, no. 1, pp. 45–77, 1995.
[45] G. K. F. Tso and K. K. W. Yau, “Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks,” Energy, vol. 32, no. 9, pp. 1761–1768, Sep. 2007, doi: 10.1016/j.energy.2006.11.010.
[46] S. Singh and P. Gupta, “Comparative study ID3, CART and C4.5 decision tree algorithm: a survey,” International Journal of Advanced Information Science and Technology (IJAIST), vol. 27, no. 27, pp. 97–103, 2014.
[47] L. Rokach and O. Maimon, “Top-Down Induction of Decision Trees Classifiers—A Survey,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 35, pp. 476–487, Dec. 2005, doi: 10.1109/TSMCC.2004.843247.
[48] T.-S. Lim, W.-Y. Loh, and Y.-S. Shih, “A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms,” Machine Learning, vol. 40, no. 3, pp. 203–228, 2000.
[49] W.-Y. Loh, “Fifty Years of Classification and Regression Trees,” International Statistical Review, vol. 82, Jun. 2014, doi: 10.1111/insr.12016.
[50] S. R. Jiao, J. Song, and B. Liu, “A Review of Decision Tree Classification Algorithms for Continuous Variables,” in Journal of Physics: Conference Series, 2020, vol. 1651, no. 1, p. 012083.
[51] Y.-Y. Song and Y. Lu, “Decision tree methods: applications for classification and prediction,” Shanghai Archives of Psychiatry, vol. 27, pp. 130–135, Apr. 2015, doi: 10.11919/j.issn.1002-0829.215044.
[52] Rekha Molala, “Entropy, Information gain and Gini Index; the crux of a Decision Tree,” Medium, Mar. 23, 2020. https://blog.clairvoyantsoft.com/entropy-information-gain-and-gini-index-the-crux-of-a-decision-tree-99d0cdc699f4 (accessed Dec. 28, 2020).
[53] V. Cheushev, D. A. Simovici, V. Shmerko, and S. Yanushkevich, “Functional entropy and decision trees,” in Proceedings. 1998 28th IEEE International Symposium on Multiple-Valued Logic (Cat. No. 98CB36138), 1998, pp. 257–262.
[54] X. Chen, Z. Yang, and W. Lou, “Fault Diagnosis of Rolling Bearing Based on the Permutation Entropy of VMD and Decision Tree,” in 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE), Xiamen, China, Oct. 2019, pp. 1911–1915, doi: 10.1109/EITCE47263.2019.9095187.
[55] C. Shang, M. Li, S. Feng, Q. Jiang, and J. Fan, “Feature selection via maximizing global information gain for text classification,” Knowledge-Based Systems, vol. 54, pp. 298–309, Dec. 2013, doi: 10.1016/j.knosys.2013.09.019.
[56] T. Maszczyk and W. Duch, “Comparison of Shannon, Renyi and Tsallis entropy used in decision trees,” in International Conference on Artificial Intelligence and Soft Computing, 2008, pp. 643–651.
[57] L. E. Raileanu and K. Stoffel, “Theoretical Comparison between the Gini Index and Information Gain Criteria,” Annals of Mathematics and Artificial Intelligence, vol. 41, no. 1, pp. 77–93, May 2004, doi: 10.1023/B:AMAI.0000018580.96245.c6.
[58] Y. Liu, L. Hu, F. Yan, and B. Zhang, “Information Gain with Weight Based Decision Tree for the Employment Forecasting of Undergraduates,” in 2013 IEEE International Conference on Green Computing and Communications and IEEE Internet of Things and IEEE Cyber, Physical and Social Computing, Beijing, China, Aug. 2013, pp. 2210–2213, doi: 10.1109/GreenCom-iThings-CPSCom.2013.417.
[59] R. L. De Mántaras, “A distance-based attribute selection measure for decision tree induction,” Machine Learning, vol. 6, no. 1, pp. 81–92, 1991.
[60] S. Taneja, C. Gupta, K. Goyal, and D. Gureja, “An enhanced k-nearest neighbor algorithm using information gain and clustering,” in 2014 Fourth International Conference on Advanced Computing & Communication Technologies, 2014, pp. 325–329.
[61] Y. Zhao and Y. Zhang, “Comparison of decision tree methods for finding active objects,” Advances in Space Research, vol. 41, no. 12, pp. 1955–1959, 2008.
[62] K. Mittal, D. Khanduja, and P. C. Tewari, “An insight into ‘Decision Tree Analysis’,” World Wide Journal of Multidisciplinary Research and Development, vol. 3, no. 12, pp. 111–115, 2017.
[63] Priyanka and D. Kumar, “Decision tree classifier: a detailed survey,” International Journal of Information and Decision Sciences, vol. 12, no. 3, pp. 246–269, 2020.
[64] Q. Zou, K. Qu, Y. Luo, D. Yin, Y. Ju, and H. Tang, “Predicting diabetes mellitus with machine learning techniques,” Frontiers in Genetics, vol. 9, p. 515, 2018.
[65] T. A. Assegie and P. S. Nair, “Handwritten digits recognition with decision tree classification: a machine learning approach,” International Journal of Electrical and Computer Engineering, vol. 9, no. 5, p. 4446, 2019.
[66] F. De Felice et al., “Decision tree algorithm in locally advanced rectal cancer: an example of over-interpretation and misuse of a machine learning approach,” Journal of Cancer Research and Clinical Oncology, vol. 146, no. 3, pp. 761–765, 2020.
[67] I. H. Sarker, A. Colman, J. Han, A. I. Khan, Y. B. Abushark, and K. Salah, “BehavDT: a behavioral decision tree learning to build user-centric context-aware predictive model,” Mobile Networks and Applications, vol. 25, no. 3, pp. 1151–1161, 2020.
[68] X. Hu, C. Rudin, and M. Seltzer, “Optimal sparse decision trees,” in Advances in Neural Information Processing Systems, 2019, pp. 7267–7275.
[69] S. Patil and U. Kulkarni, “Accuracy Prediction for Distributed Decision Tree using Machine Learning approach,” in 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Apr. 2019, pp. 1365–1371, doi: 10.1109/ICOEI.2019.8862580.
[70] D. Hussain, M. A. Al-Antari, M. A. Al-Masni, S.-M. Han, and T.-S. Kim, “Femur segmentation in DXA imaging using a machine learning decision tree,” Journal of X-ray Science and Technology, vol. 26, no. 5, pp. 727–746, 2018.
[71] N. Linty, A. Farasin, A. Favenza, and F. Dovis, “Detection of GNSS Ionospheric Scintillations Based on Machine Learning Decision Tree,” IEEE Transactions on Aerospace and Electronic Systems, vol. 55, no. 1, pp. 303–317, Feb. 2019, doi: 10.1109/TAES.2018.2850385.
[72] W. Kuang, Y. Chan, S. Tsang, and W. Siu, “Machine Learning-Based Fast Intra Mode Decision for HEVC Screen Content Coding via Decision Trees,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 5, pp. 1481–1496, May 2020, doi: 10.1109/TCSVT.2019.2903547.
[73] I. Ramadhan, P. Sukarno, and M. A. Nugroho, “Comparative Analysis of K-Nearest Neighbor and Decision Tree in Detecting Distributed Denial of Service,” in 2020 8th International Conference on Information and Communication Technology (ICoICT), Yogyakarta, Indonesia, Jun. 2020, pp. 1–4, doi: 10.1109/ICoICT49345.2020.9166380.
[74] V. M. E. Batitis, M. J. G. Caballes, A. A. Ciudad, M. D. Diaz, R. D. Flores, and E. R. E. Tolentin, “Image Classification of Abnormal Red Blood Cells Using Decision Tree Algorithm,” in 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), Mar. 2020, pp. 498–504, doi: 10.1109/ICCMC48092.2020.ICCMC-00093.
[75] Y. Zhang, J. Liu, Z. Zhang, and J. Huang, “Prediction of Daily Smoking Behavior Based on Decision Tree Machine Learning Algorithm,” in 2019 IEEE 9th International Conference on Electronics Information and Emergency Communication (ICEIEC), Jul. 2019, pp. 330–333, doi: 10.1109/ICEIEC.2019.8784698.
[76] S. Nandhini and J. M. K.S, “Performance Evaluation of Machine Learning Algorithms for Email Spam Detection,” in 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), Feb. 2020, pp. 1–4, doi: 10.1109/ic-ETITE47903.2020.312.
[77] A. I. Taloba and S. S. I. Ismail, “An Intelligent Hybrid Technique of Decision Tree and Genetic Algorithm for E-Mail Spam Detection,” in 2019 Ninth International Conference on Intelligent Computing and Information Systems (ICICIS), Dec. 2019, pp. 99–104, doi: 10.1109/ICICIS46948.2019.9014756.
[78] M. O. Arowolo, M. Adebiyi, A. Adebiyi, and O. Okesola, “PCA Model For RNA-Seq Malaria Vector Data Classification Using KNN And Decision Tree Algorithm,” in 2020 International Conference in Mathematics, Computer Engineering and Computer Science (ICMCECS), Mar. 2020, pp. 1–8, doi: 10.1109/ICMCECS47690.2020.240881.
[79] S. Pathan, P. Kumar, R. Pai, and S. V. Bhandary, “Automated detection of optic disc contours in fundus images using decision tree classifier,” Biocybernetics and Biomedical Engineering, vol. 40, no. 1, pp. 52–64, 2020.
[80] A. A. Nagra et al., “Hybrid self-inertia weight adaptive particle swarm optimisation with local search using C4.5 decision tree classifier for feature selection problems,” Connection Science, vol. 32, no. 1, pp. 16–36, 2020.
[81] A. Ahmim, L. Maglaras, M. A. Ferrag, M. Derdour, and H. Janicke, “A novel hierarchical intrusion detection system based on decision tree and rules-based models,” in 2019 15th International Conference on Distributed Computing in Sensor Systems (DCOSS), 2019, pp. 228–233.
[82] M. Li, H. Xu, and Y. Deng, “Evidential decision tree based on belief entropy,” Entropy, vol. 21, no. 9, p. 897, 2019.
[83] P. Sathiyanarayanan, S. Pavithra, M. S. Saranya, and M. Makeswari, “Identification of Breast Cancer Using The Decision Tree Algorithm,” in 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), 2019, pp. 1–6.