DWDM - Case Study On Weka - Ceb624

This document describes a case study using the WEKA data mining tool to perform preprocessing and visualization on a customer dataset. It discusses loading the dataset into ARFF format, exploring attributes and statistics, applying filters for attribute selection and data transformation, and visualizing the data using histograms and scatter plots. The key steps taken include loading and exploring the raw data, selecting relevant attributes, applying filters like numeric to nominal conversion and removing instances, and visualizing the preprocessed data.

Uploaded by

CEB524SreejitGNair


SREEJIT GOPINATH NAIR SIGN:

CEB624 DATE:
TECOMP B GRADE:

VI SEM PCE DEPT OF COMPUTER ENGINEERING
Case study using WEKA

Aim:-
To demonstrate preprocessing on the dataset Customer.arff. This includes creating an ARFF
file, reading it into WEKA, and using the WEKA Explorer.

Creating an ARFF file :-

Attribute-Relation File Format (ARFF) is the file format recognized by WEKA. An ARFF
file typically has a .arff extension and contains two sections: a Header section, which
declares the relation and its attributes, and a Data section, which lists the instances.
A complete example for the Customer dataset looks like this:

@Relation Customer

@Attribute age {youth,middleage,senior}
@Attribute income {high,medium,low}
@Attribute student {yes,no}
@Attribute creditrating {fair,excellent}
@Attribute buyscomputer {yes,no}

@Data
youth,high,no,fair,no
youth,high,no,excellent,no
middleage,high,no,fair,yes
senior,medium,no,fair,yes
senior,low,yes,fair,yes
senior,low,yes,excellent,no
middleage,low,yes,excellent,yes
youth,medium,no,fair,no
youth,low,yes,fair,yes
senior,medium,yes,fair,yes
youth,medium,yes,excellent,yes
middleage,medium,no,excellent,yes
middleage,high,yes,fair,yes
senior,medium,no,excellent,no

Lines that begin with a % are comments. The @RELATION, @ATTRIBUTE and @DATA
declarations are case insensitive.
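The two-section layout described above can be illustrated with a small reader. This is only a minimal sketch in plain Python, not WEKA's own loader: the function name parse_arff is hypothetical, and it assumes every attribute is nominal (declared with a {...} value list), as in the Customer file.

```python
import re

CUSTOMER_ARFF = """\
% Customer dataset, as listed in this document
@Relation Customer
@Attribute age {youth,middleage,senior}
@Attribute income {high,medium,low}
@Attribute student {yes,no}
@Attribute creditrating {fair,excellent}
@Attribute buyscomputer {yes,no}
@Data
youth,high,no,fair,no
youth,high,no,excellent,no
middleage,high,no,fair,yes
senior,medium,no,fair,yes
senior,low,yes,fair,yes
senior,low,yes,excellent,no
middleage,low,yes,excellent,yes
youth,medium,no,fair,no
youth,low,yes,fair,yes
senior,medium,yes,fair,yes
youth,medium,yes,excellent,yes
middleage,medium,no,excellent,yes
middleage,high,yes,fair,yes
senior,medium,no,excellent,no
"""

# Matches "@attribute name {v1,v2,...}" with or without a space before "{".
ATTRIBUTE_RE = re.compile(r"@attribute\s+([^\s{]+)\s*\{([^}]*)\}", re.IGNORECASE)


def parse_arff(text):
    """Hypothetical helper: parse a nominal-only ARFF string.

    Returns (relation_name, [(attribute_name, [values])], [data_rows]).
    """
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()  # declarations are case insensitive
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            match = ATTRIBUTE_RE.match(line)
            if match is None:
                raise ValueError(f"unsupported attribute declaration: {line}")
            values = [value.strip() for value in match.group(2).split(",")]
            attributes.append((match.group(1), values))
        elif lower.startswith("@data"):
            in_data = True  # everything after this line is the Data section
    return relation, attributes, rows


relation, attributes, rows = parse_arff(CUSTOMER_ARFF)
print(relation, len(attributes), "attributes,", len(rows), "instances")
```

Numeric, string and date attributes, quoted values and sparse rows are deliberately left out to keep the sketch short.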
The WEKA Explorer
When the Explorer is opened, the following tabs are available:
1. Preprocess. Choose and modify the data being acted on.
2. Classify. Train and test learning schemes that classify or perform regression.
3. Cluster. Learn clusters for the data.
4. Associate. Learn association rules for the data.
5. Select attributes. Select the most relevant attributes in the data.
6. Visualize. View an interactive 2D plot of the data.

Preprocessing :
Step 1: Load the data by clicking the Open file button in the Preprocess panel and
selecting the appropriate file.

Step 2: Once the data is loaded, WEKA recognizes the attributes and, while scanning the
data, computes some basic statistics on each attribute. The left panel shows the
list of recognized attributes, while the top panel shows the names of the base relation (or
table) and the current working relation.

Step 3: Clicking on an attribute in the left panel shows the basic statistics for that
attribute. For categorical (nominal) attributes, the frequency of each attribute value is
shown; for continuous (numeric) attributes, we obtain the min, max, mean, and standard
deviation.
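The statistics described in Step 3 can be reproduced by hand. The sketch below is plain Python, not WEKA code. The nominal example uses the age column of the Customer dataset; the numeric values are hypothetical, because the Customer dataset has no numeric attribute, and the use of the sample standard deviation is an assumption.

```python
from collections import Counter
import statistics

# The "age" column of the Customer dataset, in file order.
AGE = [
    "youth", "youth", "middleage", "senior", "senior", "senior", "middleage",
    "youth", "youth", "senior", "youth", "middleage", "middleage", "senior",
]

# Nominal attribute: frequency of each value.
age_counts = Counter(AGE)
for value, count in age_counts.items():
    print(f"{value:10s} {count}")

# Numeric attribute: hypothetical values, for illustration only.
hypothetical_values = [1.0, 2.0, 3.0, 4.0]
summary = {
    "min": min(hypothetical_values),
    "max": max(hypothetical_values),
    "mean": statistics.mean(hypothetical_values),
    # Sample standard deviation (n - 1 in the denominator); an assumption here.
    "stddev": statistics.stdev(hypothetical_values),
}
print(summary)
```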

Step 4: The visualization in the bottom-right panel is shown in the form of a
cross-tabulation across attributes.
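A cross-tabulation simply counts how often each pair of values occurs together. As a minimal sketch (plain Python, with the hypothetical helper name crosstab), the counts for student against buyscomputer in the Customer dataset can be computed like this:

```python
from collections import Counter

# Customer dataset: age, income, student, creditrating, buyscomputer
DATA = """
youth,high,no,fair,no
youth,high,no,excellent,no
middleage,high,no,fair,yes
senior,medium,no,fair,yes
senior,low,yes,fair,yes
senior,low,yes,excellent,no
middleage,low,yes,excellent,yes
youth,medium,no,fair,no
youth,low,yes,fair,yes
senior,medium,yes,fair,yes
youth,medium,yes,excellent,yes
middleage,medium,no,excellent,yes
middleage,high,yes,fair,yes
senior,medium,no,excellent,no
"""
ROWS = [line.split(",") for line in DATA.split()]


def crosstab(rows, first_index, second_index):
    """Hypothetical helper: count co-occurrences of two attribute values."""
    return Counter((row[first_index], row[second_index]) for row in rows)


# student is column 2, buyscomputer (the class) is column 4.
table = crosstab(ROWS, 2, 4)
print("student  buys=yes  buys=no")
for student in ("yes", "no"):
    print(f"{student:8s} {table[(student, 'yes')]:8d} {table[(student, 'no')]:8d}")
```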

Step 5: Selecting or filtering attributes

The Filter box is used to set up the required filters. At the left of the Filter box is a
Choose button. Once a filter has been selected, its name and options are shown in the field
next to the Choose button. Clicking on this field brings up a GenericObjectEditor dialog box,
which lets you configure the filter. Once the settings have been chosen, click OK to return
to the main Explorer window.
Now apply the filter to the data by pressing the Apply button at the right end of the Filter
panel. The Preprocess panel will then show the transformed data. The change can be undone
using the Undo button. Use the Edit button to open the dataset editor, where the data can be
viewed before and after the transformation.

Step 6: Visualization:
WEKA offers several ways to visualize the data. The main GUI shows a histogram of the
distribution of a single selected attribute at a time, coloured according to the class
attribute. Individual colors indicate individual classes. Moving the mouse over the
histogram shows the ranges and how many samples fall in each range. The Visualize All
button brings up a screen showing all distributions at once.
There is also a tab called Visualize. Clicking it opens the scatter plots for all
attribute pairs.

Task-1
Describe the chosen dataset and a few of its attributes.
Example: Description of the Iris dataset
Title: Iris data
Number of Instances: 150
Number of Attributes : 5 (4 numeric, plus the nominal class)

Attribute description for Iris:

Attribute 1 :- sepallength
Attribute 2 :- sepalwidth
Attribute 3 :- petallength
Attribute 4 :- petalwidth
Attribute 5 :- class
Task-2
List all the categorical (or nominal) attributes
Attribute 5 : class (nominal; values Iris-setosa, Iris-versicolor, Iris-virginica)
The remaining attributes (sepallength, sepalwidth, petallength, petalwidth) are numeric.
Task-3
What attributes do you think might be crucial in making the assessment?
The following attributes are crucial: sepallength, sepalwidth, petallength, petalwidth.
Measures used for this attribute relevance analysis include information gain, Gini index,
and gain ratio.
WEKA can carry out this attribute relevance analysis directly.
Steps: WEKA Explorer -> Preprocess -> Open file (credit-g.arff) -> select all attributes ->
click the Select attributes tab -> choose the attribute evaluator (InfoGainAttributeEval)
and Ranker as the search method -> click "Start"
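To make the information gain measure concrete, the sketch below ranks the attributes of the Customer dataset listed earlier by their information gain with respect to buyscomputer. It is plain Python written for illustration, not WEKA's evaluator, and the function names entropy and info_gain are hypothetical.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["age", "income", "student", "creditrating", "buyscomputer"]
DATA = """
youth,high,no,fair,no
youth,high,no,excellent,no
middleage,high,no,fair,yes
senior,medium,no,fair,yes
senior,low,yes,fair,yes
senior,low,yes,excellent,no
middleage,low,yes,excellent,yes
youth,medium,no,fair,no
youth,low,yes,fair,yes
senior,medium,yes,fair,yes
youth,medium,yes,excellent,yes
middleage,medium,no,excellent,yes
middleage,high,yes,fair,yes
senior,medium,no,excellent,no
"""
ROWS = [line.split(",") for line in DATA.split()]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(rows, attr_index, class_index=-1):
    """Entropy of the class minus the weighted entropy after splitting on one attribute."""
    labels = [row[class_index] for row in rows]
    remainder = 0.0
    for value in set(row[attr_index] for row in rows):
        subset = [row[class_index] for row in rows if row[attr_index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


# Rank every non-class attribute, highest gain first.
gains = {name: info_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES[:-1])}
ranking = sorted(gains, key=gains.get, reverse=True)
for name in ranking:
    print(f"{name:13s} {gains[name]:.3f}")
```

On this data, age has the highest gain (about 0.247 bits), followed by student, creditrating and income.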
Task-4
Apply any two filters to the Customer dataset. Capture snapshots of the dataset before
filtering, the GenericObjectEditor window of each filter, and the transformed dataset
after filtering. Describe each chosen filter in a few lines. A few recommended filters
are listed below (Choose -> weka -> filters -> unsupervised -> attribute):

Before Filter
After Filter (AddCluster)
1) NumericToNominal
2) MathExpression
3) InterquartileRange

Try the following unsupervised instance filters
(Choose -> weka -> filters -> unsupervised -> instance):

RemoveMisclassified

RemovePercentage
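The general idea behind two of these filters can be sketched in a few lines of plain Python. This is not WEKA's implementation: the function names, the rounding of the cut-off, the choice to drop rows from the start of the dataset, and the sample values are all assumptions made for illustration.

```python
def numeric_to_nominal(values):
    """Idea of NumericToNominal: treat each distinct number as a nominal label."""
    labels = [str(v) for v in sorted(set(values))]  # declared nominal values
    converted = [str(v) for v in values]            # each value becomes its label
    return labels, converted


def remove_percentage(rows, percentage, invert=False):
    """Idea of RemovePercentage: drop the given percentage of instances.

    Assumption: the removed rows are taken from the start of the dataset;
    with invert=True the removed part is kept instead.
    """
    cut = round(len(rows) * percentage / 100)
    return rows[:cut] if invert else rows[cut:]


# Hypothetical numeric column, for illustration only.
labels, converted = numeric_to_nominal([25, 40, 25, 60])
print(labels, converted)

# Ten hypothetical instances; removing 30% leaves seven.
instances = list(range(10))
kept = remove_percentage(instances, 30)
dropped = remove_percentage(instances, 30, invert=True)
print(kept, dropped)
```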
Task-5
Display (using snapshots) any visualization technique for the chosen dataset
(histogram or scatter plot).

Histogram
CONCLUSION:
Hence the experiment was successfully implemented. It helped in analyzing data through
preprocessing and in visualizing the results.
