data mining steps

The document discusses data selection and preprocessing in data mining, highlighting the importance of data sets, test data, and training data. It explains that test data is used to evaluate machine learning models after training, while training data is essential for building effective predictive models. Additionally, it outlines data cleaning techniques, including handling missing values and removing noise to improve data quality.

Uploaded by

tushikasahu5

Exp 3

Data selection: definitions

Data set :- the attributes and records that are treated together as one body of data
A data set is a collection of data that can be used for data mining
purposes. A data set can be composed of different types of data,
such as numerical, categorical, textual, spatial, temporal, and so on. A
data set can also have different characteristics, such as size,
dimensionality, quality, and structure. Depending on the data mining
task and the data mining method, you may need to preprocess,
transform, or manipulate your data set before applying data mining
techniques.
Test Data :- data held back from training and used to evaluate the model
You need data the model has not seen in order to test your machine
learning model after it has been created (using your training data).
This data is known as testing data, and it is used to assess how well
training has worked and to judge whether the algorithm needs to be
modified or optimized for better results. Test data should:
- Be representative of the original data set.
- Be large enough to produce reliable predictions.
This data set needs to be "unseen" and recent, because the training
data has already been "learned" by your model. By observing how the
model performs on fresh test data, you can decide whether it is
operating successfully or whether it needs more training data to meet
your standards. Test data serves as a final, real-world check of
whether the machine learning algorithm has produced a model that
handles unknown data correctly.

Training Data :- the data from which the model learns

Testing data is used to determine the performance of the trained
model, whereas training data is used to train the machine learning
model. Training data is the fuel that powers a machine learning model,
and it is usually larger than the testing data, because more data
generally leads to more effective predictive models. When a machine
learning algorithm receives data from our records, it recognizes
patterns and creates a decision-making model.
Algorithms allow a company's past experience to be used to make
decisions. An algorithm analyzes all previous cases and their results
and, using this data, creates models to score and predict the outcome
of current cases. In general, the more data ML models have access to,
the more reliable their predictions become.
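The relationship between training data and test data can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the notes: the toy records, the 25% test fraction, and the helper names train_test_split, train_threshold, and accuracy. The "model" is only a simple cut-off learned from the training data, and it is then checked on records it has never seen.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and hold back a fraction as test data."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    # The first n_test shuffled records become test data, the rest training data.
    return shuffled[n_test:], shuffled[:n_test]

def train_threshold(training_data):
    """Learn a cut-off halfway between the mean value of each class."""
    zeros = [x for x, label in training_data if label == 0]
    ones = [x for x, label in training_data if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def accuracy(threshold, data):
    """Fraction of records whose predicted label matches the true label."""
    correct = sum(1 for x, label in data if (x > threshold) == bool(label))
    return correct / len(data)

# Hypothetical toy data set: (attribute value, class label).
records = [(x, 0) for x in range(0, 20)] + [(x, 1) for x in range(30, 50)]

training_data, test_data = train_test_split(records)
threshold = train_threshold(training_data)  # the model sees training data only

print("training records:", len(training_data))
print("test records:", len(test_data))
print("accuracy on unseen test data:", accuracy(threshold, test_data))
```

The training set is the larger part, and the test records play no role in choosing the threshold, so the accuracy figure reflects performance on unseen data.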

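Exp 4
Preprocessing step
Data cleaning :- handle missing values, remove unwanted parts, remove noise
1. Missing values :-
- Fill in the value manually
- Remove the attribute
- Use a global constant such as nil or null
2. Noise :- unwanted data (random error in the values)
- Smoothing by binning:
  - Sort the data
  - Divide the sorted data into bins
  - Apply a smoothing technique to each bin:
    - Smoothing by bin means
    - Smoothing by bin medians
    - Smoothing by bin boundaries
3. Regression :- smooth noisy data by fitting it to a function
4. Clustering :- group similar values; values that fall outside every cluster are treated as outliers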
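Two of the cleaning steps listed above can be sketched in Python: filling missing values with a global constant, and smoothing noise by binning. This is a rough illustration under stated assumptions, not a definitive implementation. The function names, the "Unknown" constant, the bin size of 3, and the sample prices are all hypothetical; bins here hold an equal number of values, and a value exactly halfway between the two bin boundaries is assigned to the lower one.

```python
from statistics import mean, median

def fill_missing(values, constant="Unknown"):
    """Replace missing entries (None) with a global constant."""
    return [constant if v is None else v for v in values]

def equal_frequency_bins(values, bin_size):
    """Sort the data, then divide it into bins of bin_size values each."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]

def smooth_by_mean(bins):
    """Replace every value in a bin with the bin's mean."""
    return [[mean(b)] * len(b) for b in bins]

def smooth_by_median(bins):
    """Replace every value in a bin with the bin's median."""
    return [[median(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    """Replace each value with the nearer of its bin's minimum or maximum."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # each bin is already sorted
        # Ties go to the lower boundary (an assumption of this sketch).
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed

# Hypothetical, unsorted sample values.
prices = [21, 4, 28, 8, 34, 15, 24, 21, 25]
bins = equal_frequency_bins(prices, 3)

print("bins:      ", bins)
print("means:     ", smooth_by_mean(bins))
print("medians:   ", smooth_by_median(bins))
print("boundaries:", smooth_by_boundaries(bins))
print("filled:    ", fill_missing(["red", None, "blue"]))
```

Each smoothing function keeps the number of values unchanged and only replaces them with a representative of their bin, which is how binning reduces noise without discarding records.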
