
used classification datasets with a wide range of cases and attributes. These include the Iris dataset, widely used in data mining, and two published cancer datasets (Dua and Graff, 2019; Ludwig et al., 2015). The prediction accuracies obtained for the normal and pruned FDTs, reported in Table 4-4, were desirable, indicating the promising performance of the developed decision trees in these classification problems. The table provides the maximum training and testing accuracies, the number of rules, and the computation times obtained from 10,000 iterations on each dataset; it also tabulates the number of cases, attributes, and classes in each dataset. A detailed description of the datasets can be found in the references cited in the table.

Table 4-4. Results of the simple FDT-based tool applied to multiple datasets

Dataset        | Tree   | Max. Train(a) | Max. Test(b) | Average(c)     | Time(d) | Cases | Attributes | Classes
Iris           | Normal | 97, 74, 33(e) | 85, 100, 35  | 89.1, 88.4, 34 | 0.012   | 150   | 4          | 3
               | Pruned | 97, 74, 17    | 85, 100, 9   | 89.4, 89.2, 17 | 0.012   |       |            |
Colon Cancer   | Normal | 100, 64, 13   | 91, 100, 13  | 88.3, 75.9, 10 | 0.223   | 62    | 2000       | 2
               | Pruned | 98, 74, 7     | 91, 100, 11  | 86.2, 75.3, 5  | 0.224   |       |            |
Ovarian Cancer | Normal | 100, 98, 9    | 99, 100, 9   | 98.7, 97.8, 13 | 3.094   | 253   | 15154      | 2
               | Pruned | 100, 98, 9    | 99, 100, 7   | 98.5, 96.9, 8  | 3.397   |       |            |

(a) The maximum training accuracy (%) over 10,000 iterations, with its corresponding testing accuracy (%) and number of rules
(b) The maximum testing accuracy (%) over 10,000 iterations, with its corresponding training accuracy (%) and number of rules
(c) The average training accuracy (%), testing accuracy (%), and number of rules over 10,000 iterations
(d) The average computation time (s) over 10,000 iterations
(e) The first and second numbers are the training and testing accuracy percentages, respectively; the third is the number of rules
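The benchmarking procedure behind Table 4-4 repeatedly splits each dataset, fits a normal and a pruned tree, and records the resulting accuracies and rule counts. As a minimal sketch of that procedure, the crisp-tree analogue below uses scikit-learn on the Iris dataset; the fuzzy decision tree (FDT) implementation itself is not listed here, so cost-complexity pruning (ccp_alpha) and a 70/30 split stand in as assumptions for the FDT pruning and iteration scheme.

```python
# Minimal crisp-tree analogue of the Table 4-4 experiment (one iteration).
# The thesis's fuzzy decision trees are not reproduced here; scikit-learn's
# CART with cost-complexity pruning stands in for the FDT pruning step.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for label, alpha in [("normal", 0.0), ("pruned", 0.02)]:  # alpha is illustrative
    tree = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(X_tr, y_tr)
    # Each root-to-leaf path corresponds to one "if-then" rule, so the leaf
    # count mirrors the rule counts reported in Table 4-4.
    print(f"{label}: train={tree.score(X_tr, y_tr):.2f} "
          f"test={tree.score(X_te, y_te):.2f} rules={tree.get_n_leaves()}")
```

Repeating such a split-fit-score loop 10,000 times and tracking the maxima and averages would reproduce the structure, though not the fuzzy values, of the table above.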

Results of applying the proposed structure of the FDT-based OSRM selection tool to the hypothetical oil spill dataset are presented and discussed in the following subsections.

4.3.1 Visualization of Decision Trees used in OSRM Selection

The unpruned FDT-AP1 with the highest prediction accuracy on the training data and its pruned version are shown in Figure 4-5. The unpruned FDT-AP1 (Figure 4-5a) contains 19 attribute nodes, with the highest information gain with respect to response method selection obtained at A2. The FDT determines that ISB should be applied to a spill case where the oil slick thickness is at the “H” level. After A2, the FDT branches at A6, indicating that MCR is suitable only when the remoteness of the response location is “L”. A total of 37 terminal nodes (leaves) are linked to the attribute nodes across six hierarchical levels, yielding 37 “if-then” fuzzy rules. Note that in Figure 4-5a, several branches are connected to the same leaves for a clearer presentation. Two terminal nodes are labeled “no data” because of limitations of the given data. UCD has the highest occurrence frequency (15 times) among the terminal nodes, followed by MCR (13 times) and ISB (5 times), suggesting that UCD depends relatively little on the selected attributes; it is, however, strongly affected by A1 and A5. The information discrimination power of the attributes follows the order A2, A6, A1, A3, and A4/A5.
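This ordering comes from ranking attributes by how much splitting on them reduces the class entropy of the cases. A minimal crisp sketch with a few hypothetical spill cases is shown below; the thesis's FDT uses a fuzzy generalization of this gain, which is not reproduced here, and the cases and labels are invented for illustration only.

```python
# Sketch of entropy-based attribute ranking, the criterion behind the
# ordering A2 > A6 > A1 > A3 reported above (crisp Shannon entropy;
# the thesis's fuzzy information gain is a membership-weighted version).
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction from splitting the cases on one attribute."""
    gain = entropy(labels)
    for value in set(r[attr] for r in rows):
        subset = [l for r, l in zip(rows, labels) if r[attr] == value]
        gain -= len(subset) / len(labels) * entropy(subset)
    return gain

# Hypothetical spill cases: attribute levels (L/M/H) and selected OSRM labels.
rows = [{"A2": "H", "A6": "L"}, {"A2": "H", "A6": "M"},
        {"A2": "M", "A6": "L"}, {"A2": "L", "A6": "L"}]
labels = ["ISB", "ISB", "MCR", "MCR"]

for attr in ("A2", "A6"):
    print(attr, round(information_gain(rows, labels, attr), 3))
```

In this toy data, A2 separates the classes perfectly (gain 1.0) while A6 does not (gain ≈ 0.31), so A2 would be placed at the root, as in Figure 4-5a.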

After pruning, the FDT (Figure 4-5b) exhibits a much simpler structure, in which only five attribute nodes and 11 terminal nodes remain. The hierarchy of attributes in terms of information gain is similar to that of the unpruned tree, with the attributes in the order A2, A6, A1, and A3. Attributes A4 and A5 are omitted from the pruned tree because their omission does not increase the error rate of the decision tree. The prediction performances of the different FDTs are compared in Section 4.3.2 (Evaluation of Decision Trees). In a real-world application of the pruned tree, an example “if-then” rule could be: “if attributes A2, A6, A1, and A3 are found at their corresponding moderate levels, then ISB could be a suitable response method.”
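Each root-to-leaf path of the pruned tree encodes one such fuzzy rule, and a spill case can be matched against the rules by combining its membership degrees in the antecedent fuzzy sets. The sketch below uses a min t-norm and hypothetical membership values; the second rule, the membership degrees, and the winner-take-all selection are illustrative assumptions, not the thesis's exact inference scheme.

```python
# Illustrative inference over fuzzy "if-then" rules extracted from a pruned FDT.
# Membership degrees and the min/max operators are assumptions for illustration.

def firing_strength(antecedents, memberships):
    """Conjunction of a rule's antecedent membership degrees via the min t-norm."""
    return min(memberships[attr][level] for attr, level in antecedents)

# Hypothetical membership degrees of one spill case in the L/M/H fuzzy sets.
case = {
    "A1": {"L": 0.1, "M": 0.8, "H": 0.1},
    "A2": {"L": 0.0, "M": 0.7, "H": 0.3},
    "A3": {"L": 0.2, "M": 0.6, "H": 0.2},
    "A6": {"L": 0.9, "M": 0.1, "H": 0.0},
}

# Two example rules: the "moderate levels -> ISB" rule quoted above, and a
# hypothetical "remoteness L -> MCR" rule reflecting the branch at A6.
rules = [
    ([("A2", "M"), ("A6", "M"), ("A1", "M"), ("A3", "M")], "ISB"),
    ([("A6", "L")], "MCR"),
]

# Select the response method whose rule fires most strongly for this case.
method, antecedents = max(((m, a) for a, m in rules),
                          key=lambda r: firing_strength(r[1], case))
print(method, round(firing_strength(antecedents, case), 2))
```

Here the hypothetical case has remoteness firmly at “L” (degree 0.9), so the MCR rule fires more strongly than the ISB rule, consistent with the branching at A6 described for the unpruned tree.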
