
Data Mining & Predictive Modeling Lab


PANIPAT INSTITUTE OF ENGINEERING & TECHNOLOGY

(P.I.E.T)

Department of
B. TECH CSE- AI &DS

Practical File Record


Of
Data Mining And Predictive Modelling Lab
(PC-CS- AIDS- 417LA)

Submitted To:
Name of Faculty: Dr. BK Verma
Head of Department (HOD)

Submitted By:
Name of Student: Vijay Gautam
Branch: B.Tech CSE (AI & DS)
Semester: 7th
Student Id: 2821427

Session: 2024-25 (Odd Semester)


PANIPAT INSTITUTE OF ENGINEERING TECHNOLOGY Data Mining & Predictive Modelling Lab

Assessment

Columns: S. No. | Name of Practical | Date of Practical | Lab Performance (10) | Attendance (10) | File Record (10) | Viva (10) | Total Marks (40) | Faculty Sign.

1. Create an Employee Table with the help of Data Mining Tool WEKA.
2. Create a Weather Table with the help of Data Mining Tool WEKA.
3. Apply Pre-Processing techniques to the training data set of Weather Table.
4. Apply Pre-Processing techniques to the training data set of Employee Table.
5. Normalize Weather Table data using Knowledge Flow.
6. Normalize Employee Table data using Knowledge Flow.
7. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree.
8. Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.

Department of B. TECH CSE-AI & DS Page 1 Sakshi



9. Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision, and recall for your data set.
10. Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set. You can use Java/Python ML library classes/API.

Total Marks

Aggregated Marks


Practical No. 1
Aim: Create an Employee Table with the help of Data Mining Tool WEKA.

Description: We need to create an Employee Table as a training data set with the attributes
Empid, Name, Salary, Gender, and Contactno.

Procedure:
Steps:
1) Open Start → Programs → Accessories → Notepad.

2) Type the following training data set for the Employee Table in Notepad.

@relation employee
@attribute Empid numeric
@attribute Name {Vijay,Shubham,Rajni,Pardeep,Jyoti,Rahul,Pinki,Roshni,Parvesh,Raveena}
@attribute Salary numeric
@attribute Gender {Male, Female}
@attribute Contactno numeric

@data
427,Vijay,10000,Male,97854327
402,Shubham,25000,Male,89065432
403,Rajni,23000,Female,54326788
404,Pardeep,12000,Male,97654245
405,Jyoti,15000,Female,87142677
406,Rahul,17000,Male,87906543
407,Pinki,18000,Female,64367880
408,Roshni,20000,Female,76541208
409,Parvesh,30000,Male,54236789
410,Raveena,29000,Female,56783452

3) Save the file with the .arff extension.

4) Minimize the .arff file and then open Start → Programs → weka-3-4.

5) Click on weka-3-4; the Weka dialog box is displayed on the screen.

6) The dialog box offers four modes; click on Explorer.

7) Explorer shows many options. Click on 'Open file' and select the .arff file.

8) Click on the Edit button, which shows the Employee Table in Weka.
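
Typing the file by hand makes it easy to use a name in the @data section that is not declared in the nominal Name list, and WEKA's ARFF loader reports an error for such a value. The following is a minimal Python sketch, not part of WEKA, that checks the rows against the declared labels and then writes out the ARFF text. The names ATTRIBUTES, ROWS, undeclared_values and to_arff are hypothetical helpers introduced only for illustration, and the sketch assumes the Name list declares every name that appears in the data rows.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of WEKA.
# Each attribute is (name, type); a list of labels marks a nominal attribute.
ATTRIBUTES = [
    ("Empid", "numeric"),
    ("Name", ["Vijay", "Shubham", "Rajni", "Pardeep", "Jyoti",
              "Rahul", "Pinki", "Roshni", "Parvesh", "Raveena"]),
    ("Salary", "numeric"),
    ("Gender", ["Male", "Female"]),
    ("Contactno", "numeric"),
]

ROWS = [
    (427, "Vijay", 10000, "Male", 97854327),
    (402, "Shubham", 25000, "Male", 89065432),
    (403, "Rajni", 23000, "Female", 54326788),
    (404, "Pardeep", 12000, "Male", 97654245),
    (405, "Jyoti", 15000, "Female", 87142677),
    (406, "Rahul", 17000, "Male", 87906543),
    (407, "Pinki", 18000, "Female", 64367880),
    (408, "Roshni", 20000, "Female", 76541208),
    (409, "Parvesh", 30000, "Male", 54236789),
    (410, "Raveena", 29000, "Female", 56783452),
]


def undeclared_values(attributes, rows):
    """Return (row_number, attribute, value) for nominal values missing from the header."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for (name, kind), value in zip(attributes, row):
            # The comparison is exact, so "parvesh" and "Parvesh" are different labels.
            if isinstance(kind, list) and value not in kind:
                problems.append((row_number, name, value))
    return problems


def to_arff(relation, attributes, rows):
    """Build the ARFF text: header lines, a blank line, then the @data rows."""
    lines = [f"@relation {relation}"]
    for name, kind in attributes:
        spec = "{" + ",".join(kind) + "}" if isinstance(kind, list) else kind
        lines.append(f"@attribute {name} {spec}")
    lines.append("")
    lines.append("@data")
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    problems = undeclared_values(ATTRIBUTES, ROWS)
    if problems:
        print("Fix these before loading the file:", problems)
    else:
        print(to_arff("employee", ATTRIBUTES, ROWS))
```

Running the check before saving catches spelling and letter-case mismatches between the header and the data that would otherwise only surface when the file is opened in the Explorer.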

Output:

(Screenshot of the Employee Table in the WEKA viewer; image not reproduced.)


Student Work Sheet


Practical No. 1
Aim: Create an Employee Table with the help of Data Mining Tool WEKA.
Objective: This practical focuses on creating an employee table that contains fields such as Employee
ID (Empid), Name, Salary, Gender, and Contact number. The dataset will be entered manually using a
text editor and saved in ARFF format. The WEKA tool will be used to load, visualize, and analyze the
data.
Requirements:
• Notepad (for creating the .arff file)
• WEKA software (version 3.4 or higher)
• Employee dataset

Output:

After completing the steps, the employee table should appear in the WEKA interface, showing columns
for Empid, Name, Salary, Gender, and Contactno.

Outcomes to be filled by student:


S. No. Outcomes
1 The student will understand how to create and save an ARFF file.
2 The student will learn how to load a dataset into WEKA.
3 The student will visualize data in the WEKA explorer.
4 The student will understand basic WEKA data manipulation.

Viva Questions:
Q1 : What is the purpose of the WEKA tool?
Ans : WEKA is used for data mining tasks, providing tools for data preprocessing,
classification, clustering, and visualization.
Q2 : What file format is used to load data into WEKA?
Ans : WEKA primarily uses the ARFF (Attribute-Relation File Format) for loading datasets.
Q3 : How would you define an attribute in an ARFF file?
Ans : Attributes are defined using the @attribute keyword followed by the attribute name and
type (numeric or nominal).
Q4 : Can WEKA handle both numeric and categorical data?
Ans : Yes, WEKA can handle both numeric and categorical (nominal) data types.

Marks/Grade....................... Signature of Faculty



Practical No. 2

Aim: Create a Weather Table with the help of Data Mining Tool WEKA.

Description: We need to create a Weather Table as a training data set with the attributes
Outlook, Temperature, Humidity, Windy, and Play.
Procedure:
Steps:
1) Open Start → Programs → Accessories → Notepad.

2) Type the following training data set for the Weather Table in Notepad.

@relation weather
@attribute Outlook {sunny, overcast, rainy}
@attribute Temperature numeric
@attribute Humidity numeric
@attribute Windy {TRUE, FALSE}
@attribute Play {yes, no}

@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 78, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no

3) Save the file with the .arff extension.

4) Minimize the .arff file and then open Start → Programs → weka-3-4.

5) Click on weka-3-4; the Weka dialog box is displayed on the screen.

6) The dialog box offers four modes; click on Explorer.

7) Explorer shows many options. Click on 'Open file' and select the .arff file.

8) Click on the Edit button, which shows the Weather Table in Weka.
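
To see what the Explorer reads from the file, the layout of the Weather ARFF (a header of @relation and @attribute lines followed by comma-separated @data rows) can be parsed with a few lines of Python. This is a deliberately minimal sketch under stated assumptions: parse_arff is a hypothetical helper, not WEKA's loader, and it ignores quoted values, missing values (?), and sparse rows.

```python
# Minimal illustrative reader for the simple ARFF layout used in this practical.
WEATHER_ARFF = """\
@relation weather
@attribute Outlook {sunny, overcast, rainy}
@attribute Temperature numeric
@attribute Humidity numeric
@attribute Windy {TRUE, FALSE}
@attribute Play {yes, no}

@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 78, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no
"""


def parse_arff(text):
    """Return (relation, attributes, rows) for a simple, unquoted ARFF text."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()  # ARFF keywords are matched case-insensitively here
        if in_data:
            values = [value.strip() for value in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                row[name] = float(value) if kind == "numeric" else value
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                # Nominal attribute: keep the list of declared labels.
                kind = [label.strip() for label in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(WEATHER_ARFF)
play_yes = sum(1 for row in rows if row["Play"] == "yes")
print(relation, len(attributes), "attributes,", len(rows), "rows,", play_yes, "with Play = yes")
```

The printed counts give a quick cross-check against the table shown by the Edit button in the Explorer.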

Output:

(Screenshot of the Weather Table in the WEKA viewer; image not reproduced.)


Student Work Sheet


Practical No. 2
Aim: Create a Weather Table with the help of Data Mining Tool WEKA
Objective: To learn how to create and visualize a simple weather dataset using WEKA.
Requirements:
• Notepad (for creating the .arff file)
• WEKA software (version 3.4 or higher)
• Weather dataset

Output:
After completing the steps, the weather table will appear in the WEKA interface, displaying the columns
for Outlook, Temperature, Humidity, Windy, and Play.
Outcomes to be filled by student:
S. No. Outcomes
1 The student will understand how to create and save an ARFF file.
2 The student will learn how to load a dataset into WEKA.
3 The student will visualize a weather dataset in the WEKA explorer.
4 The student will understand basic data manipulation using WEKA.

Viva Questions:
Q1 : What is the primary purpose of using WEKA for this experiment?
Ans : WEKA is used for data analysis, visualization, and performing various machine
learning tasks, such as classification and clustering.
Q2 : What is the ARFF file format?
Ans : ARFF (Attribute-Relation File Format) is a text format used by WEKA to represent
datasets. It includes attribute definitions and the actual data.
Q3 : What are the attributes used in this weather dataset?
Ans : The attributes used are Outlook, Temperature, Humidity, Windy, and Play.
Q4 : How does WEKA help in data mining and machine learning?
Ans : WEKA provides various tools for data preprocessing, classification, clustering,
association, and visualization, helping to apply machine learning algorithms to datasets.

Marks/Grade....................... Signature of Faculty


Practical No. 3
Aim: Apply Pre-Processing techniques to the training data set of Weather Table

Description: Real-world databases are highly susceptible to noisy, missing, and inconsistent
data because of their huge size. Pre-processing the data improves its quality, and with it
the quality and efficiency of the mining results.

Three pre-processing techniques are applied here:


1) Add
2) Remove
3) Normalization

Creation of Weather Table:


Procedure:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set with the help of Notepad for Weather Table.

@relation weather
@attribute Outlook {sunny, overcast, rainy}
@attribute Temperature numeric
@attribute Humidity numeric
@attribute Windy {TRUE, FALSE}
@attribute Play {yes, no}

@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 78, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no
3) After that the file is saved with .arff file format.
4) Minimize the .arff file and then open Start → Programs → WEKA-3.8.6
5) Click on WEKA-3.8.6, then WEKA dialog box is displayed on the screen.
6) In that dialog box, there are four modes, click on explorer.
7) Explorer shows many options. In that, click on 'open file' and select the .arff file.


8) Click on the edit button which shows the weather table on WEKA.

OUTPUT:

PRE-PROCESSING TECHNIQUES:
1. Add:-
Procedure:
1. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
2. Click on Explorer.
3. Click on Open file.
4. Select the Weather .arff file and click on Open.
5. Click on the Choose button and select the filters option.
6. Filters are grouped into supervised and unsupervised.
7. Click on unsupervised.
8. Under attribute, select the Add filter.
9. A new window opens.
10. In it, enter the attribute index, type, date format, and nominal label values for Climate.
11. Click on OK.
12. Press the Apply button; a new attribute is added to the Weather Table.
13. Save the file.
14. Click on the Edit button; it shows the new Weather Table in WEKA.
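
The effect of the Add step can be imitated in plain Python to make clear what changes in the table. WEKA's Add filter creates the new attribute with every value missing, which ARFF writes as ?; the values are filled in afterwards. The sketch below does not call WEKA, uses only three of the Weather rows for brevity, and add_attribute is a hypothetical helper name.

```python
# Plain-Python imitation of adding an attribute; it does not call WEKA.
MISSING = "?"  # ARFF notation for a missing value

header = ["Outlook", "Temperature", "Humidity", "Windy", "Play"]
rows = [
    ["sunny", 85, 85, "FALSE", "no"],
    ["overcast", 83, 78, "FALSE", "yes"],
    ["rainy", 65, 70, "TRUE", "no"],
]


def add_attribute(header, rows, name, position=None):
    """Return copies of header and rows with a new, all-missing attribute."""
    if position is None:
        position = len(header)  # default: append as the last column
    new_header = header[:position] + [name] + header[position:]
    new_rows = [row[:position] + [MISSING] + row[position:] for row in rows]
    return new_header, new_rows


new_header, new_rows = add_attribute(header, rows, "Climate")
print(new_header)
for row in new_rows:
    print(row)
```

The original header and rows are left untouched, which mirrors the fact that the change only reaches the .arff file once it is saved.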


Weather Table after adding the new attribute CLIMATE (an updated viewer window showing the added
Climate attribute)

2. Remove:-

Procedure:
1. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
2. Click on Explorer.
3. Click on Open file.
4. Select the Weather .arff file and click on Open.
5. Click on the Choose button and select the filters option.
6. Filters are grouped into supervised and unsupervised.
7. Click on unsupervised.
8. Under attribute, select the Remove filter.
9. Select the attributes WINDY and PLAY for removal.
10. Click the Remove button and then Save.
11. Click on the Edit button; it shows the new Weather Table in WEKA.
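
The Remove step amounts to dropping whole columns from the table. A minimal plain-Python sketch of that effect follows; it does not call WEKA, uses three Weather rows for brevity, and remove_attributes is a hypothetical helper name.

```python
# Plain-Python imitation of removing attributes; it does not call WEKA.
header = ["Outlook", "Temperature", "Humidity", "Windy", "Play"]
rows = [
    ["sunny", 85, 85, "FALSE", "no"],
    ["overcast", 83, 78, "FALSE", "yes"],
    ["rainy", 65, 70, "TRUE", "no"],
]


def remove_attributes(header, rows, names):
    """Return copies of header and rows without the named attributes."""
    keep = [index for index, name in enumerate(header) if name not in names]
    new_header = [header[index] for index in keep]
    new_rows = [[row[index] for index in keep] for row in rows]
    return new_header, new_rows


reduced_header, reduced_rows = remove_attributes(header, rows, {"Windy", "Play"})
print(reduced_header)
for row in reduced_rows:
    print(row)
```

Play is the attribute a classifier would normally predict in this dataset, so removing it is fine for demonstrating the filter but leaves no class label for later experiments.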


Weather Table after Removing attributes WINDY, PLAY (An updated viewer window without the
Windy and Play attributes)

3. Normalize:-

Procedure:
1. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
2. Click on Explorer.
3. Click on Open file.
4. Select the Weather.arff file and click on Open.
5. Click on the Choose button and select the filters option.
6. Filters are grouped into supervised and unsupervised.
7. Click on unsupervised.
8. Under attribute, select the Normalize filter.
9. Select the attributes Temperature and Humidity to normalize.
10. Click on the Apply button and then Save.
11. Click on the Edit button; it shows the new Weather Table with normalized values in WEKA.
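
Normalization here means min-max scaling: each value x of a numeric attribute is replaced by (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1. The sketch below applies that formula to the Temperature and Humidity columns of the Weather Table. It is an illustration in plain Python, not a call into WEKA, and returning 0.0 for a constant column is a choice made for this sketch only.

```python
# Min-max scaling of the two numeric Weather attributes to the range [0, 1].
def min_max_normalize(values):
    """Scale values linearly so that min -> 0.0 and max -> 1.0."""
    low, high = min(values), max(values)
    if high == low:
        # Constant column: no spread to scale, so this sketch returns zeros.
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


# Columns copied from the @data section of the Weather Table, in row order.
temperature = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
humidity = [85, 90, 78, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 91]

norm_temperature = min_max_normalize(temperature)
norm_humidity = min_max_normalize(humidity)

for t, h in zip(norm_temperature, norm_humidity):
    print(f"{t:.3f}  {h:.3f}")
```

The first row (Temperature 85, the maximum) maps to 1.0 and the seventh row (Temperature 64, the minimum) maps to 0.0, which is a quick way to confirm the values shown in the viewer after the filter is applied.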


Weather Table after Normalizing attributes TEMPERATURE, HUMIDITY:

RESULTS: The pre-processing techniques were successfully applied to the training data set of the
Weather Table using the WEKA tool.


Student Work Sheet


Practical No. 3
Aim: Apply Pre-Processing techniques to the training data set of Weather Table
Objective: To understand and apply different pre-processing techniques such as adding, removing, and
normalizing attributes in a dataset to improve data quality and consistency.
Requirements:
• Notepad (for creating the .arff file)
• WEKA software (version 3.8.6 or higher)
• Weather dataset

Output: After applying the pre-processing techniques, the Weather Table will be updated with the
following changes:
• Added Attribute: A new attribute (e.g., Climate) will appear in the table.
• Removed Attributes: The Windy and Play attributes will no longer be present.
• Normalized Attributes: The values for Temperature and Humidity will be normalized to a specified
range.
Outcomes to be filled by student:
S. No. Outcomes
1 The student will understand how to add attributes to a dataset using WEKA.
2 The student will learn how to remove unwanted attributes from a dataset using WEKA.
3 The student will understand the concept and application of normalization in data pre-processing.
4 The student will be able to apply basic data pre-processing techniques to improve dataset quality.

Viva Questions:
Q1 : What is the purpose of data pre-processing in data mining?
Ans : Data pre-processing is essential to clean, transform, and prepare raw data, improving
its quality and ensuring better results during data analysis and machine learning.
Q2 : What are the different pre-processing techniques you applied in this experiment?
Ans : The techniques applied are Add (adding new attributes), Remove (removing unwanted
attributes), and Normalization (scaling numeric values to a standard range).
Q3 : What is the importance of normalization in data pre-processing?
Ans : Normalization helps scale numeric attributes to a uniform range, improving the
performance of machine learning algorithms, especially those sensitive to the range of
values, such as distance-based algorithms.


Q4 : What is the impact of removing attributes from a dataset?


Ans : Removing unnecessary or irrelevant attributes reduces the dimensionality of the
dataset, improving computational efficiency and potentially increasing the performance
of the learning algorithms.

Marks/Grade....................... Signature of Faculty


Practical No. 4
Aim: Apply Pre-Processing techniques to the training data set of Employee Table.

Description: Real-world databases are highly susceptible to noisy, missing, and inconsistent
data because of their huge size. Pre-processing the data improves its quality, and with it
the quality and efficiency of the mining results.

Three pre-processing techniques are applied here:


1) Add
2) Remove
3) Normalization

Creation of Employee Table:


Procedure:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set with the help of Notepad for Employee Table.

@relation employee
@attribute Empid numeric
@attribute Name {Vijay,Shubham,Rajni,Pardeep,Jyoti,Rahul,Pinki,Roshni,Parvesh,Raveena}
@attribute Salary numeric
@attribute Gender {Male, Female}
@attribute Contactno numeric

@data
427,Vijay,10000,Male,97854327
402,Shubham,25000,Male,89065432
403,Rajni,23000,Female,54326788
404,Pardeep,12000,Male,97654245
405,Jyoti,15000,Female,87142677
406,Rahul,17000,Male,87906543
407,Pinki,18000,Female,64367880
408,Roshni,20000,Female,76541208
409,Parvesh,30000,Male,54236789
410,Raveena,29000,Female,56783452

3) After that the file is saved with .arff file format.


4) Minimize the .arff file and then open Start → Programs → WEKA-3.8.6
5) Click on WEKA-3.8.6, then WEKA dialog box is displayed on the screen.
6) In that dialog box, there are four modes, click on explorer.
7) Explorer shows many options. In that, click on 'open file' and select the .arff file.
8) Click on the Edit button, which shows the Employee Table in WEKA.

OUTPUT:

(Screenshot of the Employee Table in the WEKA viewer; image not reproduced.)

PRE-PROCESSING TECHNIQUES:
1. Add:-

Procedure:
1. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
2. Click on Explorer.
3. Click on Open file.
4. Select the Employee.arff file and click on Open.
5. Click on the Choose button and select the filters option.
6. Filters are grouped into supervised and unsupervised.
7. Click on unsupervised.
8. Under attribute, select the Add filter.
9. A new window opens.
10. In it, enter the attribute index, type, date format, and nominal label values for Address.
11. Click on OK.
12. Press the Apply button; a new attribute is added to the Employee Table.
13. Save the file.
14. Click on the Edit button; it shows the new Employee Table in WEKA.


Employee Table after adding the new attribute Address (an updated viewer window showing the added
Address attribute)

2. Remove:-

Procedure:
1. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
2. Click on Explorer.
3. Click on Open file.
4. Select the Employee.arff file and click on Open.
5. Click on the Choose button and select the filters option.
6. Filters are grouped into supervised and unsupervised.
7. Click on unsupervised.
8. Under attribute, select the Remove filter.
9. Select the attributes Gender and Contactno for removal.
10. Click the Remove button and then Save.
11. Click on the Edit button; it shows the new Employee Table in WEKA.


Employee Table after Removing attributes GENDER, CONTACTNO (An updated viewer window
without the Gender and Contactno attributes)

3. Normalize:-

Procedure:
1. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
2. Click on Explorer.
3. Click on Open file.
4. Select the Employee.arff file and click on Open.
5. Click on the Choose button and select the filters option.
6. Filters are grouped into supervised and unsupervised.
7. Click on unsupervised.
8. Under attribute, select the Normalize filter.
9. Select the attributes Empid and Salary to normalize.
10. Click on the Apply button and then Save.
11. Click on the Edit button; it shows the new Employee Table with normalized values in WEKA.
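
One reason to normalize Empid and Salary is that attributes on very different scales contribute unequally to a distance between two records. The following plain-Python sketch, which does not call WEKA, compares the Euclidean distance between two employees before and after min-max scaling. The helper names normalize_columns and euclidean are hypothetical, and since Empid is an identifier, whether it should take part in a distance at all is a modelling decision outside this sketch.

```python
import math

# (Empid, Salary) pairs copied from the Employee Table, in row order.
employees = [
    (427, 10000), (402, 25000), (403, 23000), (404, 12000), (405, 15000),
    (406, 17000), (407, 18000), (408, 20000), (409, 30000), (410, 29000),
]


def normalize_columns(rows):
    """Min-max scale every column of rows to the range [0, 1]."""
    scaled_columns = []
    for column in zip(*rows):
        low, high = min(column), max(column)
        span = (high - low) or 1  # avoid division by zero for a constant column
        scaled_columns.append([(value - low) / span for value in column])
    return [tuple(values) for values in zip(*scaled_columns)]


def euclidean(a, b):
    """Euclidean distance between two equal-length records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Compare Shubham (402, 25000) with Rajni (403, 23000).
raw_distance = euclidean(employees[1], employees[2])
normalized = normalize_columns(employees)
scaled_distance = euclidean(normalized[1], normalized[2])

print(f"raw distance:    {raw_distance:.4f}")
print(f"scaled distance: {scaled_distance:.4f}")
```

Before scaling, the distance is almost entirely the salary difference of 2000; after scaling, both attributes lie in [0, 1] and each contributes in proportion to its own range.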


Employee Table after Normalizing attributes EMPID, SALARY:


RESULTS: The pre-processing techniques were successfully applied to the training data set of the
Employee Table using the WEKA tool.


Student Work Sheet


Practical No. 4
Aim: Apply Pre-Processing techniques to the training data set of Employee Table.
Objective: To understand and apply data pre-processing techniques such as adding, removing, and normalizing
attributes in a dataset to improve the quality of data and handle missing or inconsistent information.

Requirements:
• Notepad (for creating the .arff file)
• WEKA software (version 3.8.6 or higher)
• Employee dataset.

Output: After applying the pre-processing techniques, the Employee Table will be updated with the
following changes:
• Added Attribute: A new attribute (e.g., Address) will appear in the table.
• Removed Attributes: The Gender and Contactno attributes will no longer be present.
• Normalized Attributes: The values for Empid and Salary will be normalized to a specified range.
Outcomes to be filled by student:
S. No. Outcomes
1 Understand how to add attributes to a dataset using WEKA.
2 Learn how to remove unwanted attributes from a dataset.
3 Understand normalization and its impact on numeric attributes.
4 Be able to apply pre-processing techniques to improve data quality.

Viva Questions:
Q1 : What is the significance of applying pre-processing techniques to a dataset?
Ans : Pre-processing helps clean, transform, and prepare raw data, ensuring better quality
data that improves the performance of machine learning models.
Q2 : What is the difference between supervised and unsupervised filters in WEKA?
Ans : Supervised filters use class information to process data, while unsupervised filters do
not rely on class labels and apply the filter directly to the dataset.
Q3 : How can you add an attribute to a dataset in WEKA?
Ans : In WEKA, you can add an attribute using the Add filter, where you specify the attribute
name, type, and format, then apply the filter to add it to the dataset.
Q4 : Can pre-processing techniques affect the performance of machine learning models? How?
Ans : Yes, pre-processing techniques can significantly affect performance by improving data
quality, reducing noise, handling missing values, and ensuring consistency, which
results in better model training and evaluation.

Marks/Grade....................... Signature of Faculty
