Classification and
Naïve Bayes Classifier
Classification vs. Prediction
• Classification
• predicts categorical class labels (discrete or nominal)
• classifies data (constructs a model) based on the training
set and the values (class labels) in a classifying attribute
and uses it in classifying new data
• Prediction
• models continuous-valued functions, i.e., predicts
unknown or missing values
• Typical applications
• Credit approval
• Target marketing
• Medical diagnosis
• Fraud detection
Classification: Definition
• Given a collection of records (training set)
• Each record contains a set of attributes; one of the attributes is the class.
• Find a model for the class attribute as a function of the values of the other
attributes.
• Goal: previously unseen records should be assigned a class as
accurately as possible.
• A test set is used to determine the accuracy of the model. Usually, the given data set is
divided into training and test sets, with the training set used to build the model and the test
set used to validate it.
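The train/test division described above can be sketched in plain Python. A minimal sketch, assuming a 70/30 split and toy records (both are illustrative, not from the slides):

```python
import random

# Toy labeled records: (feature value, class label). Values are illustrative.
records = [(i, "yes" if i % 2 == 0 else "no") for i in range(20)]

random.seed(0)                      # reproducible shuffle
random.shuffle(records)

split = int(0.7 * len(records))     # hold out 30% of the data for testing
train_set, test_set = records[:split], records[split:]

print(len(train_set), len(test_set))  # 14 6
```

The model is then built only on `train_set`, and accuracy is measured only on `test_set`, which the model never saw during construction.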
Classification—A Two-Step Process
• Model construction: describing a set of
predetermined classes
• Each tuple/sample is assumed to belong to a
predefined class, as determined by the class
label attribute
• The set of tuples used for model
construction is the training set
• The model is represented as classification
rules, decision trees, or mathematical
formulae
Classification—A Two-Step Process
• Model usage: for classifying future or unknown
objects
• Estimate accuracy of the model
• The known label of a test sample is compared
with the classified result from the model
• Accuracy rate is the percentage of test set
samples that are correctly classified by the
model
• The test set is independent of the training set;
otherwise over-fitting will occur
• If the accuracy is acceptable, use the model to classify data
tuples whose class labels are not known
Classification Process (1): Model Construction
• Training data is fed to a classification algorithm, which produces a classifier (model).

Training Data:
NAME   RANK            YEARS  TENURED
Mike   Assistant Prof  3      no
Mary   Assistant Prof  7      yes
Bill   Professor       2      yes
Jim    Associate Prof  7      yes
Dave   Assistant Prof  6      no
Anne   Associate Prof  3      no

Learned classifier (model):
IF rank = 'professor' OR years > 6
THEN tenured = 'yes'
Classification Process (2): Use the Model in Prediction
• The classifier is first applied to the testing data to estimate accuracy, then to unseen data.

Testing Data:
NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes

Unseen Data: (Jeff, Professor, 4) → Tenured?
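The learned rule from step (1) — IF rank = 'professor' OR years > 6 THEN tenured = 'yes' — can be applied to the testing data above to estimate accuracy, and then to the unseen record. A minimal sketch:

```python
def predict_tenured(rank, years):
    # Rule learned in step (1): professors, or anyone with more than 6 years
    return "yes" if rank == "Professor" or years > 6 else "no"

# Testing data from the slide: (name, rank, years, known label)
test_set = [
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]

correct = sum(predict_tenured(rank, years) == label
              for _, rank, years, label in test_set)
accuracy = correct / len(test_set)
print(accuracy)  # 0.75 (Merlisa is misclassified as 'yes')

# Apply the model to the unseen record (Jeff, Professor, 4)
print(predict_tenured("Professor", 4))  # yes
```

Note how the accuracy rate (3 of 4 test samples correct) is computed on the test set only, as the two-step process requires.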
The Learning Process in a Spam Mail Example
• Emails from the email server are used first for model learning and then for model testing.
• Example features:
● Number of recipients
● Size of message
● Number of attachments
● Number of "re's" in the subject line
Classification: An Example
• A fish-packing plant wants to automate the
process of sorting incoming fish according to
species
• As a pilot project, it is decided to try to separate
sea bass from salmon using optical sensing
Classification: An Example (continued)
• Features/attributes:
Length
Lightness
Width
Position of mouth
Classification: An Example (continued)
Preprocessing: Images of different fishes are isolated from one another and from the background;
Feature extraction: The information of a single fish is then sent to a feature extractor, which measures certain “features” or “properties”;
Classification: The values of these features are passed to a classifier that evaluates the evidence presented and builds a model to discriminate between the two species
Classification: An Example (continued)
Domain knowledge:
◦ A sea bass is generally longer than a salmon
Related feature (or attribute):
◦ Length
Training the classifier:
◦ Some examples are provided to the classifier in this
form: <fish_length, fish_name>
◦ These examples are called training examples
◦ The classifier learns from the training examples how
to distinguish salmon from sea bass based on the
fish_length
Classification: An Example (continued)
• Classification model (hypothesis):
◦ The classifier generates a model from the training data to
classify future examples (test examples)
◦ An example of the model is a rule like this:
◦ If Length >= l* then sea bass, otherwise salmon
◦ Here the value of l* is determined by the classifier
Testing the model:
◦ Once we get a model out of the classifier, we may use the
classifier to test future examples
◦ The test data is provided in the form <fish_length>
◦ The classifier outputs <fish_type> by checking fish_length
against the model
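One simple way a classifier might pick l* is to scan candidate thresholds and keep the one with the best training accuracy. A minimal sketch, using the <fish_length, fish_name> training examples from the later slide (the scanning strategy itself is an illustrative assumption):

```python
# Training examples in the form <fish_length, fish_name> (from the slides)
train = [(12, "salmon"), (15, "sea bass"), (8, "salmon"), (5, "sea bass")]

def accuracy(threshold, data):
    # Rule: if Length >= threshold then sea bass, otherwise salmon
    preds = ["sea bass" if length >= threshold else "salmon"
             for length, _ in data]
    return sum(p == name for p, (_, name) in zip(preds, data)) / len(data)

# Use the observed lengths as the candidate values for l*
l_star = max((length for length, _ in train),
             key=lambda t: accuracy(t, train))
print(l_star, accuracy(l_star, train))  # 15 0.75
```

Even the best single-length threshold misclassifies the short sea bass in this data, which is exactly what motivates the "Why error?" slide and the move to a second feature.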
Classification: An Example (continued)
• So the overall classification process goes like this:
◦ Training data → preprocessing and feature extraction → feature vectors → training → classification model
◦ Test/unlabeled data → preprocessing and feature extraction → feature vectors → testing against the model → prediction/evaluation
Classification: An Example (continued)
• Training (labeled data):
◦ Training data → preprocessing, feature extraction → feature vectors: 12, salmon; 15, sea bass; 8, salmon; 5, sea bass
◦ Learned model: if len > 12 then sea bass else salmon
• Testing (unlabeled data):
◦ Labeled test data: 15, salmon → classified sea bass (error!); 10, salmon → classified salmon (correct)
◦ Unlabeled test data: 18, ? → sea bass; 8, ? → salmon
Classification: An Example (continued)
• Why error?
Insufficient training data
Too few features
Too many/irrelevant features
Overfitting / specialization
Classification: An Example (continued)
• Training (labeled data), now with two features (len, ltns):
◦ Training data → preprocessing, feature extraction → feature vectors: 12, 4, salmon; 15, 8, sea bass; 8, 2, salmon; 5, 10, sea bass
◦ Learned model: if ltns > 6 or len*5 + ltns*2 > 100 then sea bass else salmon
• Testing:
◦ Labeled test data: 15, 2, salmon → salmon (correct); 10, 7, salmon → salmon (correct)
◦ Unlabeled test data: 18, 7, ? → sea bass; 8, 5, ? → salmon
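The two-feature model from this slide can be written directly; checking it against the four training vectors shows it separates all of them, unlike the length-only rule. A minimal sketch:

```python
def classify(length, lightness):
    # Model from the slide: if ltns > 6 or len*5 + ltns*2 > 100
    # then sea bass, else salmon
    if lightness > 6 or length * 5 + lightness * 2 > 100:
        return "sea bass"
    return "salmon"

# Training feature vectors from the slide: (len, ltns, label)
train = [(12, 4, "salmon"), (15, 8, "sea bass"),
         (8, 2, "salmon"), (5, 10, "sea bass")]

# The model now classifies every training example correctly
assert all(classify(l, lt) == name for l, lt, name in train)

# Unlabeled test vectors from the slide
print(classify(18, 7))  # sea bass
print(classify(8, 5))   # salmon
```

Adding the lightness feature is what lets a single rule capture the short-but-light sea bass that the length-only threshold got wrong.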
Linear, Non-linear, Multi-class and Multi-label Classification
Linear Classification
• A linear classifier makes a classification decision
based on the value of a linear combination of the
characteristics (features).
• A classification algorithm (classifier) that makes its
classification based on a linear predictor function
combining a set of weights with the feature vector
• Decision boundaries are flat
• Line, plane, …
• May involve non-linear operations
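Over the two features used in the figures (new recipients, email length), a linear classifier reduces to checking the sign of the linear combination w·x + b. A minimal sketch; the weight values, bias, and the spam/ham class names are illustrative assumptions, not learned values:

```python
# Hypothetical weights and bias for features (new_recipients, email_length)
w = (0.8, -0.5)   # illustrative weight vector
b = -1.0          # illustrative bias

def predict(x):
    # Linear predictor function: the sign of w.x + b decides the class
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "spam" if score > 0 else "ham"

print(predict((5, 2)))   # score = 0.8*5 - 0.5*2 - 1.0 = 2.0  -> spam
print(predict((1, 4)))   # score = 0.8 - 2.0 - 1.0 = -2.2     -> ham
```

The decision boundary is the set of points where the score equals 0, which in two dimensions is a straight line; this is why the boundary is "flat".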
Linear Classifiers
• (Figure: emails plotted by New Recipients vs. Email Length, with several candidate linear decision boundaries.)
• Any of these would be fine…
• …but which is best?
No Linear Classifier Can Cover All Instances
• (Figure: data in the New Recipients vs. Email Length plane that no single straight line separates.)
• How would you classify this data?
• Ideally, the best decision boundary should be the one which provides optimal performance, such as in the following figure.
• (Figure: the same data separated by a non-linear decision boundary.)
What Is Multiclass?
• Output: one of K classes
• In some cases, the output space can be very large (i.e., K
is very large)
• Each input belongs to exactly one class
(c.f. in multilabel classification, an input belongs to many classes)
Multi-Class Classification
• Multi-class classification is simply
classifying objects into exactly one of multiple
categories, such as classifying an image as
either a dog or a cat.
• 1. There are more than two
categories into which the images can be
classified, and
• 2. An image does not belong to more than
one class.
• If both of the above conditions are
satisfied, it is referred to as a multi-class
image classification problem.
Multi-label Classification
• When we can classify an image into more than one class (as in the image beside), it is known as a multi-label image classification problem.
• Multi-label classification is a type of classification in which an object can be categorized into more than one class.
• For example, in an image dataset, we will classify a picture as the image of a dog or cat and also classify the same image based on the breed of the dog or cat.
• (Figure caption: These are all labels of the given images. Each image here belongs to more than one class, and hence it is a multi-label image classification problem.)
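The difference between the two settings shows up directly in how labels are represented: multi-class assigns exactly one label per image, while multi-label assigns a set of labels. A minimal sketch with hypothetical images and labels:

```python
# Multi-class: each image gets exactly one label
multiclass = {"img1": "dog", "img2": "cat"}

# Multi-label: each image gets a set of labels (e.g. species and breed)
multilabel = {"img1": {"dog", "labrador"}, "img2": {"cat", "persian"}}

# Every multi-class example carries a single label...
assert all(isinstance(label, str) for label in multiclass.values())
# ...while a multi-label example may carry several at once
assert all(len(labels) >= 1 for labels in multilabel.values())

print(sorted(multilabel["img1"]))  # ['dog', 'labrador']
```

In practice the label set is often encoded as a binary indicator vector (one bit per possible label), which is why multi-label classifiers can be trained as a collection of per-label binary decisions.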
Binary vs. Multi-class
Multi-class vs. Multi-label Classification
Naïve Bayes Classifier
• Bayes' theorem:
P(A|B) = P(B|A) · P(A) / P(B)
Where,
P(A|B) is the Posterior probability: the probability of hypothesis A given the
observed event B.
P(B|A) is the Likelihood: the probability of the evidence given that
hypothesis A is true.
P(A) is the Prior probability: the probability of the hypothesis before
observing the evidence.
P(B) is the Marginal probability: the probability of the evidence.
• To find the normalized probability, divide each class's score P(B|A)·P(A) by the sum of the scores over all classes (this is equivalent to dividing by P(B)).
• In the slide's worked example (table not reproduced here), the given new instance falls under the classified target "No".
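A minimal Naïve Bayes sketch on a tiny hypothetical weather dataset (the records below are illustrative, not the slide's missing worked table, and Laplace smoothing is omitted for brevity). Each class is scored as P(C)·∏ᵢ P(xᵢ|C), using the naïve assumption that features are conditionally independent given the class; on this toy data the instance (Sunny, Hot) is classified as "No":

```python
from collections import Counter, defaultdict

# Hypothetical training records: (outlook, temperature) -> play
data = [
    (("Sunny", "Hot"), "No"),
    (("Sunny", "Mild"), "No"),
    (("Overcast", "Hot"), "Yes"),
    (("Rain", "Mild"), "Yes"),
    (("Rain", "Cool"), "Yes"),
    (("Sunny", "Cool"), "Yes"),
]

priors = Counter(label for _, label in data)   # class counts
cond = defaultdict(Counter)                    # (class, feature index) -> value counts
for features, label in data:
    for i, value in enumerate(features):
        cond[(label, i)][value] += 1

def predict(features):
    # Argmax over classes of P(C) * prod_i P(x_i | C); no smoothing
    def score(label):
        s = priors[label] / len(data)
        for i, value in enumerate(features):
            s *= cond[(label, i)][value] / priors[label]
        return s
    return max(priors, key=score)

print(predict(("Sunny", "Hot")))  # No
```

Here P(No)·P(Sunny|No)·P(Hot|No) = (2/6)·(2/2)·(1/2) ≈ 0.167 beats P(Yes)·P(Sunny|Yes)·P(Hot|Yes) = (4/6)·(1/4)·(1/4) ≈ 0.042, so the instance is classified as "No"; normalizing both scores by their sum would turn them into the posterior probabilities.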