Data Mining & Predictive Modeling Lab
(P.I.E.T)
Department of
B. Tech CSE - AI & DS
Assessment
S. No. | Name of Practical | Date of Practical | Performance (10) | File Record (10) | Attendance (10) | Viva (10) | Total Marks (40) | Lab Faculty Sign.
Total Marks
Aggregated Marks
Practical No. 1
Aim: Create an Employee Table with the help of Data Mining Tool WEKA.
Description: We need to create an Employee Table with a training data set that includes the
attributes Empid, Name, Salary, Gender, and Contactno.
Procedure:
Steps:
1) Open Start → Programs → Accessories → Notepad.
2) Type the following training data set with the help of Notepad for Employee Table.
@relation employee
@attribute Empid numeric
@attribute Name {Vijay,Shubham,Rajni,Pardeep,Jyoti,Rahul,Pinki,Roshni,Parvesh,Raveena}
@attribute Salary numeric
@attribute Gender {Male, Female}
@attribute Contactno numeric
@data
427,Vijay,10000,Male,97854327
402,Shubham,25000,Male,89065432
403,Rajni,23000,Female,54326788
404,Pardeep,12000,Male,97654245
405,Jyoti,15000,Female,87142677
406,Rahul,17000,Male,87906543
407,Pinki,18000,Female,64367880
408,Roshni,20000,Female,76541208
409,Parvesh,30000,Male,54236789
410,Raveena,29000,Female,56783452
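3) Save the file with the .arff file format.
4) Minimize the .arff file and then open Start → Programs → WEKA-3.8.6.
5) Click on WEKA-3.8.6; the WEKA dialog box is displayed on the screen.
6) In that dialog box, click on Explorer.
7) Explorer shows many options. In that, click on 'open file' and select the .arff file.
8) Click on the Edit button, which shows the Employee Table in WEKA.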
Output:
After completing the steps, the employee table should appear in the WEKA interface, showing columns
for Empid, Name, Salary, Gender, and Contactno.
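The Notepad steps above can also be scripted. The following is a minimal Python sketch, not part of the WEKA procedure itself, that writes the same header and rows as text. The helper name build_arff and the output file name employee.arff are assumptions chosen for illustration.

```python
# Sketch: build the employee ARFF text in Python instead of typing it in Notepad.
# build_arff and the file name "employee.arff" are illustrative, not part of WEKA.

ROWS = [
    (427, "Vijay", 10000, "Male", 97854327),
    (402, "Shubham", 25000, "Male", 89065432),
    (403, "Rajni", 23000, "Female", 54326788),
    (404, "Pardeep", 12000, "Male", 97654245),
    (405, "Jyoti", 15000, "Female", 87142677),
    (406, "Rahul", 17000, "Male", 87906543),
    (407, "Pinki", 18000, "Female", 64367880),
    (408, "Roshni", 20000, "Female", 76541208),
    (409, "Parvesh", 30000, "Male", 54236789),
    (410, "Raveena", 29000, "Female", 56783452),
]


def build_arff(relation, attributes, rows):
    """Return ARFF text.

    attributes is a list of (name, kind) pairs, where kind is either the
    string 'numeric' or a list of nominal labels.
    """
    lines = [f"@relation {relation}"]
    for name, kind in attributes:
        if isinstance(kind, list):
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {kind}")
    lines.append("@data")
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Take the nominal labels for Name from the data so the two cannot disagree.
ATTRIBUTES = [
    ("Empid", "numeric"),
    ("Name", [row[1] for row in ROWS]),
    ("Salary", "numeric"),
    ("Gender", ["Male", "Female"]),
    ("Contactno", "numeric"),
]

arff_text = build_arff("employee", ATTRIBUTES, ROWS)

if __name__ == "__main__":
    with open("employee.arff", "w") as handle:
        handle.write(arff_text)
```

Deriving the Name labels from the rows avoids the common mistake of declaring a label that does not match the value used under @data.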
Viva Questions:
Q1 : What is the purpose of the WEKA tool?
Ans : WEKA is used for data mining tasks, providing tools for data preprocessing,
classification, clustering, and visualization.
Q2 : What file format is used to load data into WEKA?
Ans : WEKA primarily uses the ARFF (Attribute-Relation File Format) for loading datasets.
Q3 : How would you define an attribute in an ARFF file?
Ans : Attributes are defined using the @attribute keyword followed by the attribute name and
type (for example, numeric, nominal, string, or date).
Q4 : Can WEKA handle both numeric and categorical data?
Ans : Yes, WEKA can handle both numeric and categorical (nominal) data types.
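To make the structure of an attribute declaration concrete, here is a small Python sketch that splits an @attribute line into its name and type. It is an illustration only: the function name parse_attribute is hypothetical, and the pattern handles just the simple numeric and nominal forms used in this manual (no quoted names or date formats).

```python
import re

# Matches "@attribute <name> <type>", where <type> is a keyword or a {label,...} list.
ATTRIBUTE_RE = re.compile(
    r"@attribute\s+([^\s{]+)\s*(\{[^}]*\}|\w+)\s*$", re.IGNORECASE
)


def parse_attribute(line):
    """Return (name, kind); kind is a list of labels for nominal attributes,
    otherwise the lowercase type keyword."""
    match = ATTRIBUTE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a valid @attribute line: {line!r}")
    name, kind = match.groups()
    if kind.startswith("{"):
        labels = [label.strip() for label in kind[1:-1].split(",")]
        return name, labels
    return name, kind.lower()


print(parse_attribute("@attribute Salary numeric"))
print(parse_attribute("@attribute Gender {Male , Female}"))
```

A line with no space after the keyword, such as "@attributeName{...}", is rejected by this sketch, which mirrors the need to separate the keyword from the attribute name.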
Practical No. 2
Aim: Create a Weather Table with the help of Data Mining Tool WEKA.
Description: We need to create a Weather table with training data set which includes attributes
like outlook, temperature, humidity, windy, play.
Procedure:
Steps:
1) Open Start → Programs → Accessories → Notepad.
2) Type the following training data set with the help of Notepad for Weather Table.
@relation weather
@attribute Outlook {sunny, overcast, rainy}
@attribute Temperature numeric
@attribute Humidity numeric
@attribute Windy {TRUE, FALSE}
@attribute Play {yes, no}
@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 78, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no
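3) Save the file with the .arff file format.
4) Minimize the .arff file and then open Start → Programs → WEKA-3.8.6.
5) Click on WEKA-3.8.6; the WEKA dialog box is displayed on the screen.
6) In that dialog box, click on Explorer.
7) Explorer shows many options. In that, click on 'open file' and select the .arff file.
8) Click on the Edit button, which shows the Weather Table in WEKA.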
Output:
After completing the steps, the weather table will appear in the WEKA interface, displaying the columns
for Outlook, Temperature, Humidity, Windy, and Play.
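As a cross-check on what the loaded table should contain, the sketch below counts instances, attributes, and label frequencies for the weather data in plain Python. It does not use WEKA; the variable and function names are chosen for illustration.

```python
from collections import Counter

# The @data section of the weather table, copied from the listing above.
WEATHER_DATA = """\
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 78, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no
"""


def load_rows(text):
    """Split comma-separated lines into lists of trimmed string values."""
    return [[value.strip() for value in line.split(",")]
            for line in text.strip().splitlines()]


rows = load_rows(WEATHER_DATA)
num_instances = len(rows)
num_attributes = len(rows[0])
outlook_counts = Counter(row[0] for row in rows)   # first column: Outlook
play_counts = Counter(row[-1] for row in rows)     # last column: Play

print("Instances:", num_instances)
print("Attributes:", num_attributes)
print("Outlook:", dict(outlook_counts))
print("Play:", dict(play_counts))
```

For this data set the script reports 14 instances and 5 attributes, with Play split into 9 yes and 5 no, which can be compared against the summary shown when the file is opened in WEKA.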
Outcomes to be filled by student:
S. No. Outcomes
1 The student will understand how to create and save an ARFF file.
2 The student will learn how to load a dataset into WEKA.
3 The student will visualize a weather dataset in the WEKA explorer.
4 The student will understand basic data manipulation using WEKA.
Viva Questions:
Q1 : What is the primary purpose of using WEKA for this experiment?
Ans : WEKA is used for data analysis, visualization, and performing various machine
learning tasks, such as classification and clustering.
Q2 : What is the ARFF file format?
Ans : ARFF (Attribute-Relation File Format) is a text format used by WEKA to represent
datasets. It includes attribute definitions and the actual data.
Q3 : What are the attributes used in this weather dataset?
Ans : The attributes used are Outlook, Temperature, Humidity, Windy, and Play.
Q4 : How does WEKA help in data mining and machine learning?
Ans : WEKA provides various tools for data preprocessing, classification, clustering,
association, and visualization, helping to apply machine learning algorithms to datasets.
Practical No. 3
Aim: Apply Pre-Processing techniques to the training data set of Weather Table
Description: Real-world databases are highly susceptible to noisy, missing, and inconsistent
data because of their huge size. Pre-processing the data improves its quality, and consequently
the quality of the mining results and the efficiency of the mining process.
Procedure:
Steps:
1) Open Start → Programs → Accessories → Notepad.
2) Type the following training data set with the help of Notepad for Weather Table.
@relation weather
@attribute Outlook {sunny, overcast, rainy}
@attribute Temperature numeric
@attribute Humidity numeric
@attribute Windy {TRUE, FALSE}
@attribute Play {yes, no}
@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 78, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no
3) After that the file is saved with .arff file format.
4) Minimize the .arff file and then open Start → Programs → WEKA-3.8.6
5) Click on WEKA-3.8.6; the WEKA GUI Chooser dialog box is displayed on the screen.
6) In that dialog box, choose Explorer from the listed applications.
7) Explorer shows many options. In that, click on 'open file' and select the .arff file.
8) Click on the edit button which shows the weather table on WEKA.
OUTPUT:
PRE-PROCESSING TECHNIQUES:
1. Add:-
Procedure:
1. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
2. Click on explorer
3. Click on open file
4. Select Weather .arff file and click on open.
5. In the Filter panel, click on the Choose button to open the filters list.
6. Under filters, there are supervised and unsupervised folders.
7. Click on unsupervised.
8. Open the attribute folder and select Add.
9. Click on the filter name; a new window is opened.
10. In that window, enter the attribute index, name (Climate), type, date format, and nominal label values.
11. Click on OK.
12. Press the Apply button, then a new attribute is added to the Weather Table.
13. Save the file.
14. Click on the Edit button, it shows a new Weather Table on WEKA.
Weather Table after Adding new attribute CLIMATE (An updated viewer window with the Climate
attribute added)
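The effect of the Add step can be pictured with a short Python sketch. WEKA's Add filter creates the new attribute with missing values, which ARFF writes as ?. The function add_attribute below is a hypothetical stand-in that works on in-memory lists, not a WEKA API.

```python
def add_attribute(header, rows, name, position=None):
    """Return a new (header, rows) pair with `name` inserted.

    position is a 0-based index; None appends at the end. Every row gets
    '?' (the ARFF missing-value marker) in the new column.
    """
    index = len(header) if position is None else position
    new_header = header[:index] + [name] + header[index:]
    new_rows = [row[:index] + ["?"] + row[index:] for row in rows]
    return new_header, new_rows


header = ["Outlook", "Temperature", "Humidity", "Windy", "Play"]
rows = [
    ["sunny", "85", "85", "FALSE", "no"],
    ["overcast", "83", "78", "FALSE", "yes"],
    ["rainy", "70", "96", "FALSE", "yes"],
]

new_header, new_rows = add_attribute(header, rows, "Climate")
print(new_header)
for row in new_rows:
    print(row)
```

The values for Climate then have to be filled in by hand (for example in the Edit viewer), since adding the attribute only creates an empty column.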
2. Remove:-
Procedure:
15. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
16. Click on explorer.
17. Click on open file.
18. Select Weather .arff file and click on open.
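19. In the Filter panel, click on the Choose button to open the filters list.
20. Under filters, there are supervised and unsupervised folders.
21. Click on unsupervised.
22. Open the attribute folder and select Remove.
23. Click on the filter name and set attributeIndices to the positions of WINDY and PLAY (4,5 in the original table), then click on OK.
24. Click on the Apply button and then Save.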
25. Click on the Edit button, it shows a new Weather Table on WEKA
Weather Table after Removing attributes WINDY, PLAY (An updated viewer window without the
Windy and Play attributes)
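The Remove step amounts to dropping columns by position. The Python sketch below shows the idea on a few weather rows, using 1-based indices as in the procedure; remove_attributes is an illustrative helper, not a WEKA function.

```python
def remove_attributes(header, rows, indices):
    """Return (header, rows) without the attributes at the given 1-based indices."""
    drop = {i - 1 for i in indices}                 # convert to 0-based positions
    keep = [i for i in range(len(header)) if i not in drop]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


header = ["Outlook", "Temperature", "Humidity", "Windy", "Play"]
rows = [
    ["sunny", "85", "85", "FALSE", "no"],
    ["overcast", "83", "78", "FALSE", "yes"],
    ["rainy", "70", "96", "FALSE", "yes"],
]

# Windy and Play are the 4th and 5th attributes of the weather table.
kept_header, kept_rows = remove_attributes(header, rows, [4, 5])
print(kept_header)
for row in kept_rows:
    print(row)
```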
3. Normalize:-
Procedure:
26. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
27. Click on explorer.
28. Click on open file.
29. Select Weather.arff file and click on open.
30. In the Filter panel, click on the Choose button to open the filters list.
31. Under filters, there are supervised and unsupervised folders.
32. Click on unsupervised.
33. Open the attribute folder and select Normalize.
34. Select the attributes Temperature and Humidity to normalize.
35. Click on the Apply button and then Save.
36. Click on the Edit button, it shows a new Weather Table with normalized values on WEKA.
RESULTS: This program has been successfully executed by applying pre-processing techniques to
the training data set of Weather Table using WEKA tool.
Output: After applying the pre-processing techniques, the Weather Table will be updated with the
following changes:
• Added Attribute: A new attribute (e.g., Climate) will appear in the table.
• Removed Attributes: The Windy and Play attributes will no longer be present.
• Normalized Attributes: The values for Temperature and Humidity will be normalized to a specified
range (by default, 0 to 1).
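Normalization to the range 0 to 1 is usually computed as x' = (x - min) / (max - min) for each attribute. The Python sketch below applies this min-max formula to the Temperature and Humidity columns of the weather table. It is a hand-written illustration rather than WEKA's own code, and returning 0.0 for a constant column is an assumption of this sketch.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] with x' = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption of this sketch: a constant column maps to 0.0.
        return [0.0 for _ in values]
    return [(value - lo) / (hi - lo) for value in values]


# Columns copied from the weather table, in row order.
temperature = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
humidity = [85, 90, 78, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 91]

norm_temperature = min_max_normalize(temperature)
norm_humidity = min_max_normalize(humidity)

for t, h in zip(norm_temperature, norm_humidity):
    print(f"{t:.3f}, {h:.3f}")
```

The highest temperature (85) maps to 1 and the lowest (64) maps to 0; every other value falls in between.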
Outcomes to be filled by student:
S. No. Outcomes
1 The student will understand how to add attributes to a dataset using WEKA.
2 The student will learn how to remove unwanted attributes from a dataset using
WEKA.
3 The student will understand the concept and application of normalization in data pre-
processing.
4 The student will be able to apply basic data pre-processing techniques to improve
dataset quality.
Viva Questions:
Q1 : What is the purpose of data pre-processing in data mining?
Ans : Data pre-processing is essential to clean, transform, and prepare raw data, improving
its quality and ensuring better results during data analysis and machine learning.
Q2 : What are the different pre-processing techniques you applied in this experiment?
Ans : The techniques applied are Add (adding new attributes), Remove (removing unwanted
attributes), and Normalization (scaling numeric values to a standard range).
Q3 : What is the importance of normalization in data pre-processing?
Ans : Normalization helps scale numeric attributes to a uniform range, improving the
performance of machine learning algorithms, especially those sensitive to the range of
values, such as distance-based algorithms.
Practical No. 4
Aim: Apply Pre-Processing techniques to the training data set of Employee Table.
Description: Real-world databases are highly susceptible to noisy, missing, and inconsistent
data because of their huge size. Pre-processing the data improves its quality, and consequently
the quality of the mining results and the efficiency of the mining process.
@relation employee
@attribute Empid numeric
@attribute Name {Vijay,Shubham,Rajni,Pardeep,Jyoti,Rahul,Pinki,Roshni,Parvesh,Raveena}
@attribute Salary numeric
@attribute Gender {Male , Female}
@attribute Contactno numeric
@data
427,Vijay,10000,Male,97854327
402,Shubham,25000,Male,89065432
403,Rajni,23000,Female,54326788
404,Pardeep,12000,Male,97654245
405,Jyoti,15000,Female,87142677
406,Rahul,17000,Male,87906543
407,Pinki,18000,Female,64367880
408,Roshni,20000,Female,76541208
409,Parvesh,30000,Male,54236789
410,Raveena,29000,Female,56783452
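Nominal labels are matched exactly, including letter case, so every Name value under @data must appear in the declared label list. The following Python sketch checks this before the file is opened in WEKA. The helper undeclared_values and the deliberately misspelled declaration are hypothetical, included only to show what the check catches.

```python
def undeclared_values(declared_labels, column_values):
    """Return values used in the data but missing from the declaration,
    in the order they are first seen."""
    declared = set(declared_labels)
    missing = []
    for value in column_values:
        if value not in declared and value not in missing:
            missing.append(value)
    return missing


names_in_data = ["Vijay", "Shubham", "Rajni", "Pardeep", "Jyoti",
                 "Rahul", "Pinki", "Roshni", "Parvesh", "Raveena"]

# A correct declaration lists exactly the labels used in the data.
good_declaration = list(names_in_data)

# Hypothetical mistake: the label is declared in lowercase.
bad_declaration = ["parvesh" if name == "Parvesh" else name
                   for name in names_in_data]

print(undeclared_values(good_declaration, names_in_data))
print(undeclared_values(bad_declaration, names_in_data))
```

An empty result means the declaration and the data agree; any name returned needs to be corrected in either the @attribute line or the data row.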
OUTPUT:
PRE-PROCESSING TECHNIQUES:
1. Add:-
Procedure:
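1. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
2. Click on explorer
3. Click on open file
4. Select Employee.arff file and click on open.
5. In the Filter panel, click on the Choose button to open the filters list.
6. Under filters, there are supervised and unsupervised folders.
7. Click on unsupervised.
8. Open the attribute folder and select Add.
9. Click on the filter name; a new window is opened.
10. In that window, enter the attribute index, name (Address), type, date format, and nominal label values.
11. Click on OK.
12. Press the Apply button, then a new attribute is added to the Employee Table.
13. Save the file.
14. Click on the Edit button, it shows a new Employee Table on WEKA.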
Employee Table after Adding new attribute Address (An updated viewer window with the Address
attribute added)
2. Remove:-
Procedure:
15. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
16. Click on explorer.
17. Click on open file.
18. Select Employee.arff file and click on open.
19. In the Filter panel, click on the Choose button to open the filters list.
20. Under filters, there are supervised and unsupervised folders.
21. Click on unsupervised.
22. Open the attribute folder and select Remove.
23. Click on the filter name and set attributeIndices to the positions of Gender and Contactno (4,5 in the original table), then click on OK.
24. Click on the Apply button and then Save.
25. Click on the Edit button, it shows a new Employee Table on WEKA.
Employee Table after Removing attributes GENDER, CONTACTNO (An updated viewer window
without the Gender and Contactno attributes)
3. Normalize:-
Procedure:
26. Start → Programs → WEKA-3.8.6 → WEKA-3.8.6
27. Click on explorer.
28. Click on open file.
29. Select Employee.arff file and click on open.
30. In the Filter panel, click on the Choose button to open the filters list.
31. Under filters, there are supervised and unsupervised folders.
32. Click on unsupervised.
33. Open the attribute folder and select Normalize.
34. Select the attributes Empid and Salary to normalize.
35. Click on the Apply button and then Save.
36. Click on the Edit button, it shows a new Employee Table with normalized values on WEKA.
RESULTS: This program has been successfully executed by applying pre-processing techniques to
the training data set of Employee Table using WEKA tool.
Requirements:
• Notepad (for creating the .arff file)
• WEKA software (version 3.8.6 or higher)
• Employee dataset.
Output: After applying the pre-processing techniques, the Employee Table will be updated with the
following changes:
• Added Attribute: A new attribute (e.g., Address) will appear in the table.
• Removed Attributes: The Gender and Contactno attributes will no longer be present.
• Normalized Attributes: The values for Empid and Salary will be normalized to a specified range (by default, 0 to 1).
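One reason to normalize Empid and Salary is that they sit on very different scales, so any distance computed on the raw values is driven almost entirely by Salary. The Python sketch below compares the Euclidean distance between the first two employees before and after min-max scaling. It is an illustration with hand-written helpers, not output from WEKA.

```python
import math

# Empid and Salary columns copied from the employee table, in row order.
empid = [427, 402, 403, 404, 405, 406, 407, 408, 409, 410]
salary = [10000, 25000, 23000, 12000, 15000, 17000, 18000, 20000, 30000, 29000]


def min_max(values):
    """Scale a list of numbers to [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(value - lo) / (hi - lo) for value in values]


def distance(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


raw = list(zip(empid, salary))
scaled = list(zip(min_max(empid), min_max(salary)))

# Compare Vijay (row 0) with Shubham (row 1).
raw_d = distance(raw[0], raw[1])
scaled_d = distance(scaled[0], scaled[1])

print(f"raw distance:    {raw_d:.2f}")
print(f"scaled distance: {scaled_d:.2f}")
```

On the raw values the Empid difference of 25 is negligible next to the Salary difference of 15000; after scaling, both attributes contribute on comparable terms.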
Outcomes to be filled by student:
S. No. Outcomes
1 Understand how to add attributes to a dataset using WEKA.
2 Learn how to remove unwanted attributes from a dataset.
3 Understand normalization and its impact on numeric attributes.
4 Be able to apply pre-processing techniques to improve data quality.
Viva Questions:
Q1 : What is the significance of applying pre-processing techniques to a dataset?
Ans : Pre-processing helps clean, transform, and prepare raw data, ensuring better quality
data that improves the performance of machine learning models.
Q2 : What is the difference between supervised and unsupervised filters in WEKA?
Ans : Supervised filters use class information to process data, while unsupervised filters do
not rely on class labels and apply the filter directly to the dataset.
Q3 : How can you add an attribute to a dataset in WEKA?
Ans : In WEKA, you can add an attribute using the Add filter, where you specify the attribute
name, type, and format, then apply the filter to add it to the dataset.
Q4 : Can pre-processing techniques affect the performance of machine learning models?
How?
Ans : Yes, pre-processing techniques can significantly affect performance by improving data
quality, reducing noise, handling missing values, and ensuring consistency, which
results in better model training and evaluation.