
K Nearest Neighbors

Introduction To Machine Learning


Dr. Hikmat Ullah Khan
KNN - Definition

Idea:
 k-NN stands for k-Nearest Neighbours
 k-NN is a simple algorithm
 It classifies new cases based on a similarity measure between the new case and stored instances that already have class labels
KNN – Different Names
• K-Nearest Neighbors
  • Considers the k nearest neighbors only
• Memory-Based Reasoning
  • Keeps the training data in memory for computation
• Instance-Based Learning
• Example-Based Reasoning
• Case-Based Reasoning
  • Reasons from stored examples: compares the test instance against them and computes the result
• Lazy Learning
  • All computation is deferred until a classification decision is needed
WHY NEAREST NEIGHBOR?

 Used to classify objects based on the closest training examples in the feature space
 One of the top 10 data mining algorithms (ICDM paper, December 2007)
 A very good starting point for students
 Among the simplest of all data mining algorithms
 A classification method
Instance-Based Classifiers

• Store the training records
• Use the training records to predict the class label of unseen cases

[Figure: a set of stored cases with attributes Atr1 … AtrN and a class label (A, B, C, …), alongside an unseen case with the same attributes but no label]
Nearest Neighbor Classifiers
 Basic idea:
 If it swims like a duck and quacks like a duck, then it’s probably a duck

[Figure: compute the distance from the test record to every training record, then choose the k “nearest” records]
KNN – Number of Neighbors
 If k = 1,
 assign the class of the single nearest neighbor
 If k > 1,
 take a majority vote among the classes of the k nearest neighbors
Nearest-Neighbor Classifiers

[Figure: an unknown record among labeled training points]

 Requires three things:
 – The set of stored records
 – A distance metric to compute the distance between records
 – The value of k, the number of nearest neighbors to retrieve
Nearest-Neighbor Classifiers
 To classify an unknown record:
 – Compute its distance to all training records
 – Identify the k nearest neighbors
 – Use the class labels of the nearest neighbors to determine the class label of the unknown record (e.g., by taking a majority vote), as sketched below
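
These three steps translate almost directly into code. Below is a minimal Python sketch (the names euclidean and classify and the toy data are illustrative, not from the slides):

```python
import math
from collections import Counter

def euclidean(p, q):
    # Distance metric: straight-line distance between two records
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def classify(unknown, records, labels, k):
    # 1. Compute the distance from the unknown record to every training record
    distances = [(euclidean(unknown, r), c) for r, c in zip(records, labels)]
    # 2. Identify the k nearest neighbors
    nearest = sorted(distances)[:k]
    # 3. Majority vote over the neighbors' class labels
    return Counter(c for _, c in nearest).most_common(1)[0][0]

# Toy usage: two nearby "A" points outvote the distant "B" point
print(classify((1, 1), [(0, 0), (1, 2), (9, 9)], ["A", "A", "B"], k=3))  # -> "A"
```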
Definition of Nearest Neighbor

[Figure: three panels around a test point x — (a) 1-nearest neighbor, (b) 2-nearest neighbor, (c) 3-nearest neighbor]

The k-nearest neighbors of a record x are the data points that have the k smallest distances to x.
k NEAREST NEIGHBOR

[Figure: an unlabeled test point “?” among square- and triangle-class training points]

 k = 1: belongs to the square class
 k = 3: belongs to the triangle class
 k = 7: belongs to the square class

 Choosing the value of k:
 If k is too small, the classifier is sensitive to noise points
 If k is too large, the neighborhood may include points from other classes
 Choose an odd value for k to avoid ties (in two-class problems)
KNN Classification

[Figure: scatter plot of Loan amount ($0–$250,000) versus Age (0–70), with points labeled Default and Non-Default]
KNN Classification – Distance

Distance from each training record to the test record (Age = 48, Loan = $142,000):

Age   Loan       Default   Distance
25    $40,000    N         102,000
35    $60,000    N         82,000
45    $80,000    N         62,000
20    $20,000    N         122,000
35    $120,000   N         22,000
52    $18,000    N         124,000
23    $95,000    Y         47,000
40    $62,000    Y         80,000
60    $100,000   Y         42,000
48    $220,000   Y         78,000
33    $150,000   Y         8,000

48    $142,000   ?

D = √((x1 − x2)² + (y1 − y2)²)

With k = 1, the nearest neighbor is the record (Age 33, Loan $150,000, Default = Y) at distance 8,000, so the test case is classified as Default = Y.
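
As a sanity check, a short Python sketch (my own, using the table's values) reproduces the distance column and the 1-NN prediction:

```python
import math

# (Age, Loan, Default) rows from the table above
train = [(25, 40_000, "N"), (35, 60_000, "N"), (45, 80_000, "N"),
         (20, 20_000, "N"), (35, 120_000, "N"), (52, 18_000, "N"),
         (23, 95_000, "Y"), (40, 62_000, "Y"), (60, 100_000, "Y"),
         (48, 220_000, "Y"), (33, 150_000, "Y")]
test = (48, 142_000)

# Euclidean distance on the raw, unscaled attributes;
# note that the Loan attribute completely dominates the distance
dists = [(math.hypot(age - test[0], loan - test[1]), cls) for age, loan, cls in train]
print(min(dists))  # ≈ (8000.0, 'Y'): the raw 1-NN prediction is Default = Y
```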
KNN Classification – Standardized Distance

Age     Loan   Default   Distance
0.125   0.11   N         0.7652
0.375   0.21   N         0.5200
0.625   0.31   N         0.3160
0       0.01   N         0.9245
0.375   0.50   N         0.3428
0.8     0.00   N         0.6220
0.075   0.38   Y         0.6669
0.5     0.22   Y         0.4437
1       0.41   Y         0.3650
0.7     1.00   Y         0.3861
0.325   0.65   Y         0.3771

0.7     0.61   ?

Xs = (X − Min) / (Max − Min)

After standardization the nearest neighbor is the third record (distance 0.3160, Default = N), so the 1-NN prediction flips from Y to N: scaling the attributes changes the result.
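
The same computation after min-max scaling, again as an illustrative sketch with the table's values:

```python
import math

# Same data as the raw-distance snippet above
train = [(25, 40_000, "N"), (35, 60_000, "N"), (45, 80_000, "N"),
         (20, 20_000, "N"), (35, 120_000, "N"), (52, 18_000, "N"),
         (23, 95_000, "Y"), (40, 62_000, "Y"), (60, 100_000, "Y"),
         (48, 220_000, "Y"), (33, 150_000, "Y")]
test = (48, 142_000)

# Min-max scale each attribute to [0, 1], Xs = (X - min) / (max - min),
# with the test point included in the attribute ranges
ages  = [r[0] for r in train] + [test[0]]
loans = [r[1] for r in train] + [test[1]]
scale = lambda x, v: (x - min(v)) / (max(v) - min(v))

s_train = [(scale(a, ages), scale(l, loans), c) for a, l, c in train]
s_test  = (scale(test[0], ages), scale(test[1], loans))

dists = [(math.hypot(a - s_test[0], l - s_test[1]), c) for a, l, c in s_train]
print(min(dists))  # ≈ (0.316, 'N'): after scaling, the 1-NN prediction is N, not Y
```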
Nearest Neighbor Classification
 Compute the distance between two points:
 Euclidean distance:

  d(p, q) = √∑i (pi − qi)²

 Determine the class from the nearest-neighbor list
 Take the majority vote of class labels among the k nearest neighbors
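
In practice one would usually call a library rather than hand-roll the loop. For instance, assuming scikit-learn is available, the majority-vote classifier described on this slide looks like this (toy data invented for illustration):

```python
from sklearn.neighbors import KNeighborsClassifier

X = [[1, 2], [2, 3], [8, 8], [9, 9]]  # training records
y = ["A", "A", "B", "B"]              # class labels

# k = 3 neighbors, Euclidean distance (scikit-learn's default metric)
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
print(knn.predict([[2, 2]]))  # majority vote of the 3 nearest -> ['A']
```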
k NEAREST NEIGHBOR
 Common distance metrics:
 Euclidean distance (continuous attributes):
  d(p, q) = √∑(pi − qi)²

 Hamming distance (overlap metric for strings):
  bat → cat (distance = 1), toned → roses (distance = 3)

 Discrete metric (boolean metric):
  if x = y then d(x, y) = 0; otherwise d(x, y) = 1
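
A quick sketch of the last two metrics (assuming equal-length strings for Hamming distance):

```python
def hamming(p, q):
    # Number of positions at which two equal-length strings differ
    return sum(a != b for a, b in zip(p, q))

def discrete(x, y):
    # Boolean metric: 0 if identical, 1 otherwise
    return 0 if x == y else 1

print(hamming("bat", "cat"))      # 1
print(hamming("toned", "roses"))  # 3
print(discrete("duck", "duck"))   # 0
```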
Example

Test instance: Durability = 3, Strength = 7, Class = ?

Type No   Item Durability   Item Strength   Class
Type-1    7                 7               Bad
Type-2    7                 4               Bad
Type-3    3                 4               Good
Type-4    1                 4               Good
Example

Type No   Item Durability   Item Strength   Class   Distance                  Rank
Type-1    7                 7               Bad     √((7−3)² + (7−7)²) = 4    3
Type-2    7                 4               Bad     5                         4
Type-3    3                 4               Good    3                         1
Type-4    1                 4               Good    3.6                       2

With k = 3, the three nearest neighbors are Type-3 (Good), Type-4 (Good) and Type-1 (Bad), so the majority vote classifies the test instance as Good.
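
Feeding this table through the scikit-learn call shown earlier reproduces the result (again assuming the library is available):

```python
from sklearn.neighbors import KNeighborsClassifier

X = [[7, 7], [7, 4], [3, 4], [1, 4]]  # (Durability, Strength)
y = ["Bad", "Bad", "Good", "Good"]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[3, 7]]))  # -> ['Good'] (neighbors: Type-3, Type-4, Type-1)
```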


k NEAREST NEIGHBOR
 The accuracy of all NN-based algorithms depends on how the data is represented and scaled.
 Scaling issues
 Attributes may have to be scaled to prevent the distance measure from being dominated by one of the attributes.
 Examples
 Height of a person may vary from 4′ to 6′
 Weight of a person may vary from 100 lbs to 300 lbs
 Income of a person may vary from $10K to $500K
Distance – Categorical Variables

X      Y        Distance
Male   Male     0
Male   Female   1

If x = y, then D = 0; if x ≠ y, then D = 1.
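
One hypothetical way to combine this overlap metric with numeric attributes in a single distance function (a sketch, not from the slides):

```python
def mixed_distance(p, q, categorical):
    # Overlap metric (0/1) for categorical attributes,
    # squared difference for numeric ones
    total = 0.0
    for i, (a, b) in enumerate(zip(p, q)):
        if i in categorical:
            total += 0 if a == b else 1
        else:
            total += (a - b) ** 2
    return total ** 0.5

# Attributes: (age, gender); gender (index 1) is categorical
print(mixed_distance((25, "Male"), (30, "Female"), categorical={1}))  # ≈ 5.10
```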
KNN - Applications
• Classification
– legal,
– medical,
– news,
– banking
• Problem-solving
– planning,
– pronunciation
• Teaching and aiding
– help desk,
– user training
Merits and Demerits
 Advantages
 Simple technique that is easily implemented
 Can work with relatively little information
 Well suited for multi-class problems
 Learning is simple (no explicit training or preprocessing step)
 Performs best in some cases (gene and protein identification)

 Disadvantages
 Memory-intensive and computationally expensive for large datasets
 Low accuracy in the presence of noisy or irrelevant features
 Sensitive to feature selection
Exercise: KNN using R
 KNN tutorial using the R language
 https://www.youtube.com/watch?v=lDCWX6vCLFA

 Work through all the steps
 UCI dataset: WDBC data
 You may use any data you like
 Use Python or any other tool
