Confusion Matrix
for evaluating the KNN model
KNN Algorithm
• Generates a predictive model based on supervised learning.
• Trained on labelled data.
• It is a nearest-neighbour algorithm.
• Predictions for new data are made based on feature similarity with the nearest neighbours.
• Also called a feature similarity algorithm.
• We should choose the right 'K', usually around sqrt(n) (see the sketch below).
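As a rough illustration of the rule of thumb above, the choice of K can be sketched in R as follows (the training data frame name fibre_train is an assumption, not something given in the slides):

# Rule-of-thumb choice of K: roughly sqrt(n), nudged to an odd number
# 'fibre_train' is a hypothetical training data frame
n <- nrow(fibre_train)
k <- round(sqrt(n))
if (k %% 2 == 0) k <- k + 1   # an odd K avoids ties in two-class problems
k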
Examples where KNN is used for prediction
• Amazon uses it to recommend books to new customers
• Banks use it to approve loans
• Doctors use it to predict diabetes
• Predicting the risk of prostate cancer
Steps for Developing a KNN Algorithm
Step -1: Data Collection followed by importing data in R
Step -2 Prepare and explore data
Step 3: Normalizing numeric data
Step 4: Creating training and test data set
Step 5: Training a model on data
Step 6: Evaluate the model performance….Using Confusion Matrix
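The six steps above can be sketched end to end in R. The sketch below is a minimal illustration, assuming a data frame fibre_data whose column type holds the class label (cotton / silk / wool) and whose remaining columns are numeric features; it uses knn() from the class package, which the slides do not explicitly name.

# Steps 1-2: data assumed already imported and explored as 'fibre_data'
library(class)

# Step 3: normalize numeric features to the 0-1 range
normalize <- function(x) (x - min(x)) / (max(x) - min(x))
features <- as.data.frame(lapply(fibre_data[, names(fibre_data) != "type"], normalize))
labels   <- factor(fibre_data$type)

# Step 4: create training and test data sets (a simple 80/20 split)
set.seed(123)
idx <- sample(seq_len(nrow(features)), size = floor(0.8 * nrow(features)))
fibre_train <- features[idx, ];  fibre_train_target <- labels[idx]
fibre_test  <- features[-idx, ]; fibre_test_target  <- labels[-idx]

# Step 5: train the model / predict on the test set, with K near sqrt(n)
k  <- round(sqrt(nrow(fibre_train)))
m1 <- knn(train = fibre_train, test = fibre_test, cl = fibre_train_target, k = k)

# Step 6: evaluate the model's performance with a confusion matrix
table(Predicted = m1, Actual = fibre_test_target)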
What is a Confusion Matrix (CM)
• A useful tool for calibrating the output of a model, i.e. a tool for evaluating the model's performance
• Examines all possible outcomes: True Positive (TP), True Negative (TN), False Positive (FP), False Negative (FN)
• Categorises the predictions against the actual values; it is a two-dimensional matrix of predicted values x actual values
• Gives a lot of additional information beyond the accuracy of the KNN model
• Gets its name because it shows how "confused" the model is between predicted outcome values and actual outcome values
• The columns in a CM represent the actual classes and the rows the predicted classes, or vice versa
What does a Confusion Matrix (CM) look like?
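As a quick plain-R illustration of the layout, a 2 x 2 confusion matrix can be produced with table(); the predicted and actual vectors below are made up purely to show the shape of the output.

# Hypothetical predicted and actual outcomes, only to show the 2 x 2 layout
predicted <- factor(c("Y", "Y", "N", "N", "Y", "N"), levels = c("Y", "N"))
actual    <- factor(c("Y", "N", "N", "Y", "Y", "N"), levels = c("Y", "N"))
table(Predicted = predicted, Actual = actual)
#          Actual
# Predicted Y N
#         Y 2 1
#         N 1 2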
What do TP, TN, FP and FN stand for in the Confusion Matrix (CM)?
In all, there are 4 possible outcomes:
TP: cases where the model correctly predicts the positive outcome (Y), and the actual outcome is Y
FP: cases where the model incorrectly predicts Y, while the actual outcome is N
TN: cases where the model correctly predicts the negative outcome (N), and the actual outcome is N
FN: cases where the model incorrectly predicts N, while the actual outcome is Y
The matrix can be binary, with two levels, or have more levels.
In the case of the fibre identification problem, there are three levels, Cotton, Silk and Wool, for both the actual and the predicted values; the CM in this case will be a 3 x 3 matrix (an illustrative sketch follows).
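For a three-level problem like the fibre case, the same table() call gives a 3 x 3 matrix; the vectors below are illustrative only, not the actual fibre data.

# Illustrative 3-level example: the diagonal holds the correct predictions
predicted <- factor(c("Cotton", "Silk", "Wool", "Cotton", "Silk"),
                    levels = c("Cotton", "Silk", "Wool"))
actual    <- factor(c("Cotton", "Wool", "Wool", "Cotton", "Silk"),
                    levels = c("Cotton", "Silk", "Wool"))
table(Predicted = predicted, Actual = actual)   # a 3 x 3 confusion matrix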
Examples of Confusion Matrices (CM)
• For predicting mails as spam or non-spam
• For predicting cancer or no cancer in diagnoses
• For predicting whether a cancer is benign or malignant
For the prediction of prostate cancer among men, based on their medical reports for test 1 and test 2:
• Random sample tests (say, test 1 and test 2) were performed on 500 men.
• Of these, 50 actually have prostate cancer.
• The model predicted 100 total cases of prostate cancer,
• 45 of which actually have prostate cancer.
Confusion Matrix

                      Actual Positive           Actual Negative
Predicted Positive    TP = 45                   FP = 55 (Type I error)
Predicted Negative    FN = 5 (Type II error)    TN = 395
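The counts above can be entered directly as a labelled matrix in R, purely to illustrate the layout (the numbers are the hypothetical ones from this example):

# The prostate-cancer counts as a labelled 2 x 2 matrix (rows = predicted)
cm <- matrix(c(45, 5, 55, 395), nrow = 2,
             dimnames = list(Predicted = c("Positive", "Negative"),
                             Actual    = c("Positive", "Negative")))
cm
#            Actual
# Predicted   Positive Negative
#   Positive        45       55
#   Negative         5      395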
We can measure additional information from the Confusion Matrix.

Using the labelled confusion matrix above for the prostate-cancer prediction (a random sample of 500 men, of whom 50 actually have prostate cancer; 100 predicted positive, 45 of them correctly), calculate the statistics:

1. Accuracy (all correct / all) = (TP + TN) / (TP + TN + FP + FN)
   = (45 + 395) / 500 = 440 / 500 = 0.88, i.e. 88% accuracy

2. Misclassification (all incorrect / all) = (FP + FN) / (TP + TN + FP + FN)
   = (55 + 5) / 500 = 60 / 500 = 0.12, i.e. 12% misclassification
   (you can also just take 1 - Accuracy: 1 - 0.88 = 0.12, i.e. 12% misclassification)

3. Precision (true positives / predicted positives) = TP / (TP + FP)
   = 45 / (45 + 55) = 45 / 100 = 0.45, i.e. 45% precision

4. Sensitivity, aka Recall (true positives / all actual positives) = TP / (TP + FN)
   = 45 / (45 + 5) = 45 / 50 = 0.90, i.e. 90% sensitivity

5. Specificity (true negatives / all actual negatives) = TN / (TN + FP)
   = 395 / (395 + 55) = 395 / 450 = 0.88, i.e. 88% specificity
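The five statistics can be reproduced in R directly from the four counts; this is just the arithmetic above, not the output of a fitted model.

# Counts from the prostate-cancer example
TP <- 45; FP <- 55; FN <- 5; TN <- 395
total <- TP + TN + FP + FN                   # 500

accuracy          <- (TP + TN) / total       # 0.88
misclassification <- (FP + FN) / total       # 0.12 (= 1 - accuracy)
precision         <- TP / (TP + FP)          # 0.45
sensitivity       <- TP / (TP + FN)          # 0.90 (recall)
specificity       <- TN / (TN + FP)          # 0.877..., roughly 0.88

c(accuracy = accuracy, misclassification = misclassification,
  precision = precision, sensitivity = sensitivity, specificity = specificity)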
Therefore, the KNN model built for the prediction of prostate cancer among men based on their medical reports for test 1 and test 2 (hypothetical; the data are not provided), with the outcome "has prostate cancer" or "does not have prostate cancer", is 88% accurate.
Interpretation of the Results
• A confusion matrix can help us evaluate the performance of our models; it provides statistics such as Accuracy, Precision, Sensitivity, etc., and one or more of these can be used to evaluate the model.
• In the given example (prostate cancer), it can be seen that there is a high incidence of false positives (55 out of the 45 + 55 = 100 predicted positives). Therefore, precision, given by TP / (TP + FP), is just 45%. This means that we are falsely predicting prostate cancer 55% of the time. Our model is thus NOT precise.
Command in R to generate the Confusion Matrix

confusionMatrix(data = m1, factor(fibre_test_target))

1. We specify the predicted data and the actual data as the arguments (confusionMatrix() is provided by the caret package).
2. Both datasets should be of factor type; convert them with factor() if they are not.

conf_matrix <- confusionMatrix(data = m1, factor(fibre_test_target))
Assign the output of the CM to a variable.

print(conf_matrix)
Print the confusion matrix to evaluate the model based on its statistics.
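Beyond print(), individual statistics can be pulled out of the object returned by caret's confusionMatrix(); a brief sketch, assuming conf_matrix was created as above (caret stores the table in $table, overall statistics such as Accuracy in $overall, and per-class statistics in $byClass):

conf_matrix$table                  # the confusion matrix itself
conf_matrix$overall["Accuracy"]    # overall accuracy of the KNN model
conf_matrix$byClass                # per-class sensitivity, specificity, precision, ...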
R programme for predicting the fibre type (cotton, silk, wool) using the KNN model

Result of the confusion matrix for the KNN model: the model's accuracy is 1 (i.e. 100%).