
EX:NO:1

EXPLORING A WEKA TOOL

INTRODUCTION:

Invoke Weka from the Windows Start menu (on Linux or the Mac, double-click
weka.jar or weka.app, respectively). This starts the Weka GUI Chooser. Click the
Explorer button to enter the Weka Explorer. The Preprocess panel opens when
the Explorer interface starts. Click the Open file option and perform the
respective operations, as shown in the figure below.

THE PANELS:
1. PREPROCESS.
2. CLASSIFY.
3. CLUSTER.
4. ASSOCIATE.
5. SELECT ATTRIBUTES.
6. VISUALIZE.
PREPROCESS PANEL:

LOADING THE DATA-SET:


Load the dataset via the Open file option: from the data folder, choose the
required dataset from the list of datasets. For this experiment we chose the
“weather.nominal.arff” dataset and analyse its attributes.

As the result shows, the weather data has 14 instances, and 5 attributes called
outlook, temperature, humidity, windy, and play. Click on the name of an attribute in the
left subpanel to see information about the selected attribute on the right, such as its
values and how many times an instance in the dataset has a particular value. This
information is also shown in the form of a histogram. All attributes in this dataset are
“nominal”— that is, they have a predefined finite set of values. The last attribute, play, is
the “class” attribute; its value can be yes or no.
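
The same inspection can be done programmatically through Weka's Java API. The
following is a minimal sketch, assuming the ARFF file sits at the illustrative
path data/weather.nominal.arff (adjust it to your installation's data folder):

    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class LoadWeather {
        public static void main(String[] args) throws Exception {
            // Load the ARFF file; the path is an assumed example
            Instances data = DataSource.read("data/weather.nominal.arff");
            // The last attribute, play, is the class attribute
            data.setClassIndex(data.numAttributes() - 1);
            System.out.println("Instances: " + data.numInstances());   // expect 14
            System.out.println("Attributes: " + data.numAttributes()); // expect 5
            // Per-attribute summary, similar to the Preprocess panel's right subpanel
            System.out.println(data.toSummaryString());
        }
    }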

DATA SET EDITOR:


It is possible to view and edit an entire dataset from within Weka. To do this, load
the weather.nominal.arff file again. Click the Edit button from the row of buttons at
the top of the Preprocess panel. This opens a new window called Viewer, which lists
all instances of the weather data.
APPLYING FILTER:

As you know, Weka “filters” can be used to modify datasets in a systematic fashion;
that is, they are data preprocessing tools. Reload the weather.nominal dataset, and let’s
remove an attribute from it. The appropriate filter is called Remove; its full name is:

weka.filters.unsupervised.attribute.Remove

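For reference, the same filter can be applied through the Java API. A minimal
sketch, again with the file path as an illustrative assumption:

    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.Filter;
    import weka.filters.unsupervised.attribute.Remove;

    public class RemoveAttribute {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data/weather.nominal.arff"); // assumed path
            Remove remove = new Remove();
            // Remove the third attribute, humidity (indices in the option string are 1-based)
            remove.setAttributeIndices("3");
            remove.setInputFormat(data);
            Instances filtered = Filter.useFilter(data, remove);
            System.out.println("Attributes after filtering: " + filtered.numAttributes()); // 4
        }
    }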

THE VISUALISE PANEL:


Now take a look at Weka’s data visualization facilities. These work best with numeric
data, so we use the iris data. Load iris.arff, which contains the iris dataset: 150
instances, 50 examples of each of three types of iris (Iris setosa, Iris versicolor,
and Iris virginica).

1. Click the Visualize tab to bring up the Visualize panel.

2. Click the first plot in the second row to open a window showing an enlarged plot
using the selected axes. Instances are shown as little crosses, the colour of which
depends on the instance’s class. The x-axis shows the sepal length attribute, and the
y-axis shows petal width.

3. Clicking on one of the crosses opens up an Instance Info window, which lists the
values of all attributes for the selected instance. Close the Instance Info window again.


The selection fields at the top of the window containing the scatter plot determine
which attributes are used for the x- and y-axes. Change the x-axis to petalwidth and the
y-axis to petallength. The field showing Color: class (Num) can be used to change the
color coding.

Each of the barlike plots to the right of the scatter plot window represents a single
attribute. In each bar, instances are placed at the appropriate horizontal position and
scattered randomly in the vertical direction. Clicking a bar uses that attribute for the
x-axis of the scatter plot. Right-clicking a bar does the same for the y-axis. Use these
bars to change the x- and y-axes back to sepallength and petalwidth.

The Jitter slider displaces the cross for each instance randomly from its true position,
and can reveal situations where instances lie on top of one another.

Experiment a little by moving the slider.

The Select Instance button and the Reset, Clear, and Save buttons let you modify
the dataset. Certain instances can be selected and the others removed. Try the
Rectangle option: select an area by left-clicking and dragging the mouse. The Reset
button changes into a Submit button. Click it, and all instances outside the rectangle
are deleted. You could use Save to save the modified dataset to a file. Reset restores
the original dataset.

CLASSIFY PANEL:

Now we apply a classifier to the weather data. Load the weather data again. Go
to the Preprocess panel, click the Open file button, and select
“weather.nominal.arff” from the data directory. Then switch to the Classify panel by
clicking the Classify tab at the top of the window.



USING THE C4.5 CLASSIFIER:

The C4.5 algorithm for building decision trees is implemented in Weka as a
classifier called J48. Select it by clicking the Choose button near the top of the Classify
tab. A dialog window appears showing various types of classifier. Click the trees entry
to reveal its subentries, and click J48 to choose that classifier. Classifiers, like filters,
are organized in a hierarchy: J48 has the full name weka.classifiers.trees.J48.
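
Equivalently, J48 can be built directly in Java. A minimal sketch, with the file
path again an assumption:

    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class BuildJ48 {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data/weather.nominal.arff"); // assumed path
            data.setClassIndex(data.numAttributes() - 1); // play is the class attribute
            J48 tree = new J48(); // weka.classifiers.trees.J48, i.e. C4.5
            tree.buildClassifier(data);
            // Prints the same textual tree that appears in the Classifier Output box
            System.out.println(tree);
        }
    }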

OUTPUT:
The outcome of training and testing appears in the Classifier Output box on the right.
Scroll through the text and examine it. First, look at the part that describes the decision
tree, reproduced in the image below.

This represents the decision tree that was built, including the number of instances that
fall under each leaf.
The textual representation is clumsy to interpret, but Weka can generate an equivalent
graphical version.

Here’s how to get the graphical tree. Each time the Start button is pressed
and a new classifier is built and evaluated, a new entry appears in the Result list panel
in the lower left corner. Right-click the new entry and choose Visualize tree to see the
graphical version. The Classifier Output also includes the confusion matrix for the
chosen test protocol.

BUILDING THE DECISION TREE:

Setting the Test Method:


When the Start button is pressed, the selected learning algorithm is run and the
dataset that was loaded in the Preprocess panel is used with the selected test protocol.

For example, in the case of tenfold cross-validation this involves running the learning
algorithm 10 times to build and evaluate 10 classifiers. A model built from the full training
set is then printed into the Classifier Output area; this may involve running the learning
algorithm one final time. The remainder of the output depends on the test protocol that
was chosen using the test options.
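
The same tenfold cross-validation can be reproduced in Java. A minimal sketch
under the same file-path assumption:

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class CrossValidateJ48 {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data/weather.nominal.arff"); // assumed path
            data.setClassIndex(data.numAttributes() - 1);
            Evaluation eval = new Evaluation(data);
            // Tenfold cross-validation: builds and evaluates 10 classifiers internally
            eval.crossValidateModel(new J48(), data, 10, new Random(1));
            System.out.println(eval.toSummaryString());
            System.out.println(eval.toMatrixString()); // the confusion matrix
        }
    }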

VISUALISE THE ERRORS:


Right-click the trees.J48 entry in the result list and choose Visualize classifier
errors. A scatter plot window pops up. Instances that have been classified correctly are
marked by little crosses; ones that are incorrect are marked by little squares.
CLUSTER PANEL:

Clustering Data :
WEKA contains “clusterers” for finding groups of similar instances in a dataset. The
clustering schemes available in WEKA are,
✓ k-Means,
✓ EM,
✓ Cobweb,
✓ X-means,
✓ Farthest First.
Clusters can be visualized and compared to “true” clusters (if given). Evaluation is
based on log likelihood if the clustering scheme produces a probability distribution.
For this exercise we will use the weather data contained in the “weather.arff” file
and analyze it with the “k-means” clustering scheme.
Steps:
(i) Select the file from WEKA
In the ‘Preprocess’ window, click on the ‘Open file…’ button and select the “weather.arff” file.
Then click the ‘Cluster’ tab at the top of the WEKA Explorer window.
(ii) Choose the Cluster Scheme.
1. In the ‘Clusterer’ box, click on the ‘Choose’ button and select the cluster scheme
‘SimpleKMeans’ from the pull-down menu of WEKA clusterers. Some implementations of
k-means only allow numerical values for attributes, but WEKA’s SimpleKMeans also handles
nominal attributes; therefore, we do not need to use a filter.

2. Once the clustering algorithm is chosen, click on its name next to the ‘Choose’
button; the ‘weka.gui.GenericObjectEditor’ dialog comes up on the screen.

3. Set the value in the ‘numClusters’ box to 5 (instead of the default 2) to look for
five clusters. Leave the value of ‘seed’ as is. The seed value is used in generating a
random number, which in turn is used for the initial assignment of instances to clusters.
Note that, in general, k-means is quite sensitive to how clusters are initially assigned,
so it is often necessary to try different seed values and evaluate the results.
(iii) Setting the test options.
1. Before you run the clustering algorithm, you need to choose ‘Cluster mode’.
2. Click on the ‘Classes to clusters evaluation’ radio button in the ‘Cluster mode’ box
and select ‘play’ in the pull-down box below. This means that you will compare how well
the chosen clusters match up with a pre-assigned class (‘play’) in the data; a
programmatic sketch of this evaluation follows after these steps.
3. Once the options have been specified, you can run the clustering algorithm. Click on
the ‘Start’ button to execute it.
4. When training is complete, the ‘Clusterer output’ area on the right panel of the
‘Cluster’ window is filled with text describing the results of training and testing. A new
entry appears in the ‘Result list’ box on the left. These entries behave just like their
classification counterparts.
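
A minimal Java sketch of the same run, assuming an illustrative path for
weather.arff: the clusterer is trained with the class attribute removed, then
evaluated classes-to-clusters against play.

    import weka.clusterers.ClusterEvaluation;
    import weka.clusterers.SimpleKMeans;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.Filter;
    import weka.filters.unsupervised.attribute.Remove;

    public class KMeansWeather {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data/weather.arff"); // assumed path
            data.setClassIndex(data.numAttributes() - 1); // play, used only for evaluation

            // The clusterer must not see the class attribute, so remove it for training
            Remove remove = new Remove();
            remove.setAttributeIndices("" + (data.classIndex() + 1)); // 1-based index
            remove.setInputFormat(data);
            Instances train = Filter.useFilter(data, remove);

            SimpleKMeans kmeans = new SimpleKMeans();
            kmeans.setNumClusters(5); // matches the numClusters setting above
            kmeans.setSeed(10);       // the GUI's default seed
            kmeans.buildClusterer(train);

            // Classes-to-clusters evaluation against the play attribute
            ClusterEvaluation eval = new ClusterEvaluation();
            eval.setClusterer(kmeans);
            eval.evaluateClusterer(data);
            System.out.println(eval.clusterResultsToString());
        }
    }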
CLUSTER OUTPUT

(iv) Analysing Results.
The clustering model shows the centroid of each cluster and statistics on the
number and percentage of instances assigned to each cluster. Cluster centroids
are the mean vectors for each cluster; each dimension value in a centroid represents
the mean value for that dimension over the instances in the cluster.

(v) Visualisation of Results.


1. Another way of representing the results of clustering is through visualization.
2. Right-click on the entry in the ‘Result list’ and select ‘Visualize cluster
assignments’ in the pull-down menu. This brings up the ‘Weka Clusterer Visualize’ window.
3. On the ‘Weka Clusterer Visualize’ window, beneath the X-axis selector there
is a drop-down list, ‘Colour’, for choosing the colour scheme. This allows you to
choose the colour of points based on the attribute selected.
4. Below the plot area, there is a legend that describes what values the colours
correspond to; in this example, the different colours represent the different cluster
numbers. For better visibility you should change the colour of label ‘3’.

5. Left-click on ‘3’ in the ‘Class colour’ box and select a lighter colour from the
colour palette.
COLOUR PALETTE

6. You may want to save the resulting dataset, which includes each instance along
with its assigned cluster; a programmatic sketch follows below. To do so, click the
‘Save’ button in the visualization window and save the result as the file
“weather_kmeans.arff”.
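
The Save button's effect can be approximated in Java with the AddCluster filter,
which appends each instance's assigned cluster as a new attribute, plus an
ArffSaver. A minimal sketch, with both file paths assumed:

    import java.io.File;
    import weka.clusterers.SimpleKMeans;
    import weka.core.Instances;
    import weka.core.converters.ArffSaver;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.Filter;
    import weka.filters.unsupervised.attribute.AddCluster;

    public class SaveClusterAssignments {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data/weather.arff"); // assumed path
            // AddCluster appends each instance's assigned cluster as a new attribute
            AddCluster addCluster = new AddCluster();
            SimpleKMeans kmeans = new SimpleKMeans();
            kmeans.setNumClusters(5);
            addCluster.setClusterer(kmeans);
            addCluster.setInputFormat(data);
            Instances clustered = Filter.useFilter(data, addCluster);
            // Write the augmented dataset, mirroring the Save button in the visualizer
            ArffSaver saver = new ArffSaver();
            saver.setInstances(clustered);
            saver.setFile(new File("weather_kmeans.arff"));
            saver.writeBatch();
        }
    }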
ASSOCIATION PANEL:
(i) Opening the file
1. Click the ‘Associate’ tab at the top of the ‘WEKA Explorer’ window. It brings up the
interface for the Apriori algorithm.
2. The association rule scheme cannot handle numeric values; therefore, for this
exercise you will use the “weather.nominal.arff” file, where all values are nominal.
Open the “weather.nominal.arff” file.
(ii) Setting the test options
1. Click on the ‘Associator’ box; the ‘GenericObjectEditor’ appears on your screen.
In the dialog box, change the value in ‘minMetric’ to 0.4 for confidence = 40%. Leave
‘numRules’ at its default value of 10. The upper bound for minimum support,
‘upperBoundMinSupport’, should be set to 1.0 (100%) and ‘lowerBoundMinSupport’ to 0.1.
Apriori in WEKA starts with the upper-bound support and incrementally decreases support
by ‘delta’ increments, which by default is set to 0.05 (5%).
2. The algorithm halts when either the specified number of rules has been generated or
the lower bound for minimum support is reached. The ‘significanceLevel’ testing option
is only applicable in the case of confidence and is -1.0 by default (not used).
3. Once the options have been specified, you can run the Apriori algorithm. Click on the
‘Start’ button to execute it. A programmatic sketch of the same run follows below.
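
A minimal Java sketch of the same Apriori run; the options mirror the GUI settings
above and the -N 10, -C 0.4, -D 0.05, -U 1.0, -M 0.1 flags in the run information
below. The file path is an assumption:

    import weka.associations.Apriori;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class AprioriRules {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data/weather.nominal.arff"); // assumed path
            Apriori apriori = new Apriori();
            apriori.setMinMetric(0.4);            // confidence = 40% (-C 0.4)
            apriori.setNumRules(10);              // default number of rules (-N 10)
            apriori.setUpperBoundMinSupport(1.0); // start at 100% support (-U 1.0)
            apriori.setLowerBoundMinSupport(0.1); // stop at 10% support (-M 0.1)
            apriori.setDelta(0.05);               // decrease support in 5% steps (-D 0.05)
            apriori.buildAssociations(data);
            System.out.println(apriori); // prints the run information and the rules found
        }
    }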

(iii) Analysing the results


=== Run information ===
Scheme: weka.associations.Apriori -N 10 -T 0 -C 0.4 -D 0.05 -U 1.0 -M 0.1 -S -1.0 -c -1
Relation: weather.symbolic
Instances: 14
Attributes: 5
outlook
temperature
humidity
windy
play

The results of the Apriori algorithm are the following:


-> First, the program generated the sets of large itemsets found for each support
size considered. In this case, five itemsets of three items were found to have the
required minimum support.
-> By default, Apriori tries to generate ten rules. It begins with a minimum support
of 100% of the data items and decreases this in steps of 5% until there are at least ten
rules with the required minimum confidence, or until the support has reached a lower
bound of 10%, whichever occurs first. The minimum confidence is set to 0.4 (40%).
-> As you can see, the minimum support decreased to 0.3 (30%) before the required
number of rules could be generated; this involved a total of 14 iterations.
-> The last part gives the association rules that are found. The number preceding the
==> symbol indicates the rule’s support, that is, the number of items covered by its
premise. Following the rule is the number of those items for which the rule’s
consequent holds as well. In parentheses is the confidence of the rule.
