widely used in data mining and two published cancer datasets (Dua and Graff, 2019; Ludwig et
al., 2015). The prediction accuracies obtained by the normal and pruned FDTs, reported in Table 4-4,
were desirable, indicating the promising performance of the developed decision trees in these
classification problems. The table provides the maximum training and testing accuracies, the
number of rules, and the computation times obtained from 10,000 iterations on each dataset. It
also tabulates the number of cases, attributes, and classes in each dataset. Detailed descriptions
of the datasets can be found in the references linked in the table.
Table 4-4. Results of the simple FDT-based tool applied to multiple datasets
Dataset | Max. Train(a) | Max. Test(b) | Average(c) | Time(d) | Case | Attribute | Class
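As a point of reference for how the quantities in Table 4-4 can be generated, the following is a minimal Python sketch, not the thesis FDT tool, that repeats random 70/30 splits of the UCI breast cancer dataset (Dua and Graff, 2019) with an ordinary crisp decision tree and records the maximum training and testing accuracies and the total computation time. The scikit-learn classifier, the 100 iterations (in place of 10,000), and the split ratio are illustrative assumptions.

    # Minimal sketch (not the FDT-based tool): repeated random splits on the
    # UCI breast cancer dataset with a crisp decision tree, reporting the
    # maximum training/testing accuracies and total computation time,
    # analogous to the quantities tabulated in Table 4-4.
    import time
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)   # 569 cases, 30 attributes, 2 classes
    max_train, max_test = 0.0, 0.0
    start = time.perf_counter()
    for seed in range(100):                      # repeated random sub-sampling (assumed count)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
        tree = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr)
        max_train = max(max_train, tree.score(X_tr, y_tr))
        max_test = max(max_test, tree.score(X_te, y_te))
    elapsed = time.perf_counter() - start
    print(f"max train acc = {max_train:.3f}, max test acc = {max_test:.3f}, time = {elapsed:.1f} s")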
Results of applying the proposed structure for the FDT-based OSRM selection tool to the hypothetical
4.3.1 Visualization of Decision Trees used in OSRM Selection
The unpruned FDT-AP1 with the highest prediction accuracy on the training data and its pruned
version are shown in Figure 4-5. The unpruned FDT-AP1 (Figure 4-5a) contains 19 attribute
nodes, and the highest information gain with respect to response method selection is obtained
at A2. The FDT determines that ISB should be applied to a spill case where the oil slick
thickness is at the “H” level. After A2, the FDT branches at A6, indicating that MCR is suitable
only when the remoteness of the response location is “L”. A total of 37 terminal nodes (leaves) are
linked to the attribute nodes across six hierarchical levels, resulting in 37 “if-then” fuzzy rules. It
should be noted that, in Figure 4-5a, several branches are connected to the same leaves for a
clearer presentation. Two terminal nodes are characterized by “no data” because of the limitations
of the given data. UCD has the highest occurrence frequency (15 times) among the terminal
nodes, followed by MCR (13 times) and ISB (5 times), suggesting that UCD is relatively less
dependent on the selected attributes. Nevertheless, UCD is strongly affected by A1 and A5. The
information discrimination power of the attributes follows the order A2, A6, A1, A3, and A4/A5.
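The attribute ranking by information gain described above can be illustrated with a fuzzy-ID3-style calculation. The Python sketch below is an assumption about the general approach rather than the exact FDT-AP1 algorithm: it computes the fuzzy entropy of a node from the membership degrees of the cases and the fuzzy information gain of splitting that node on one attribute; the attribute with the largest gain (A2 at the root, in this application) would be selected as the next node.

    # Fuzzy-ID3-style sketch (an assumption, not the exact FDT-AP1 algorithm):
    # fuzzy entropy and fuzzy information gain computed from case membership degrees.
    import numpy as np

    def fuzzy_entropy(node_mu, class_labels, classes):
        """Entropy of a fuzzy node, with class probabilities weighted by membership."""
        total = node_mu.sum()
        if total == 0:
            return 0.0
        h = 0.0
        for c in classes:
            p = node_mu[class_labels == c].sum() / total
            if p > 0:
                h -= p * np.log2(p)
        return h

    def fuzzy_gain(node_mu, term_memberships, class_labels, classes):
        """Gain of splitting a fuzzy node on one attribute.

        term_memberships: one array per linguistic term (e.g. L/M/H),
        giving each case's membership in that term."""
        h_parent = fuzzy_entropy(node_mu, class_labels, classes)
        total = node_mu.sum()
        h_children = 0.0
        for term_mu in term_memberships:
            child_mu = node_mu * term_mu      # product t-norm (min is another common choice)
            h_children += (child_mu.sum() / total) * fuzzy_entropy(child_mu, class_labels, classes)
        return h_parent - h_children

    # Example with three hypothetical cases, two classes, and one attribute with terms L/H
    mu = np.array([1.0, 1.0, 1.0])            # full membership at the root
    y = np.array(["MCR", "ISB", "MCR"])
    terms = [np.array([0.8, 0.1, 0.6]), np.array([0.2, 0.9, 0.4])]
    print(fuzzy_gain(mu, terms, y, classes=["MCR", "ISB"]))

Applied recursively to each child node, this kind of selection produces the hierarchy of attributes reported above.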
After pruning, the FDT (Figure 4-5b) exhibits a much simpler structure, in which only five
attribute nodes and 11 terminal nodes remain. The hierarchy of attributes in terms of information gain
is similar to that of the unpruned tree, with the attributes in the order of A2, A6, A1, and A3.
Attributes A4 and A5 are omitted from the pruned tree because their omission does not increase the
error rate of the decision tree.
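The pruning criterion just described, removing parts of the tree only when the error rate does not increase, can be sketched as a bottom-up reduced-error pruning pass. The Python sketch below is an assumption about this general procedure, not the exact implementation used for FDT-AP1; the Node structure and the predict_fn and majority_label helpers are hypothetical.

    # Reduced-error-pruning sketch (assumed criterion: collapse a subtree into a
    # leaf whenever validation error does not increase), not the exact FDT-AP1 code.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Node:
        # Hypothetical node: an attribute node with children, or a leaf labelled
        # with a response method (e.g. "UCD", "MCR", "ISB").
        label: Optional[str] = None
        attribute: Optional[str] = None
        children: List["Node"] = field(default_factory=list)

        def is_leaf(self):
            return not self.children

    def error_rate(node, cases, predict_fn):
        """Fraction of validation cases misclassified by the (sub)tree."""
        wrong = sum(1 for x, y in cases if predict_fn(node, x) != y)
        return wrong / len(cases)

    def prune(node, cases, predict_fn, majority_label):
        """Bottom-up pruning: replace a subtree with a majority-class leaf
        whenever that does not increase the error rate, which is how
        attributes such as A4 and A5 can drop out of the pruned tree."""
        if node.is_leaf():
            return node
        node.children = [prune(c, cases, predict_fn, majority_label) for c in node.children]
        leaf = Node(label=majority_label(node, cases))
        if error_rate(leaf, cases, predict_fn) <= error_rate(node, cases, predict_fn):
            return leaf
        return node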
The prediction performances of the different FDTs are compared in Section 4.3.2, Evaluation of Decision Trees. In a real-world application of the pruned tree, an example
“if-then” rule could be “if attributes A2, A6, A1, and A3 are found at their corresponding