ML UCS-401
Topics: Distance-Based Models (Neighbours and Examples), Nearest Neighbours Classification, Distance-Based Clustering (K-Means Algorithm), Hierarchical Clustering

Distance-Based Models: Neighbours and Examples
Distance-based models in machine learning rely on the concept of measuring the distance or similarity between data points to make predictions, cluster data, or retrieve relevant examples. These models assume that similar data points are close to each other in the feature space.

Distance-Based Models:
1. k-Nearest Neighbors (k-NN): A non-parametric classification and regression algorithm.
• Predicts based on the labels or values of the nearest k points in the feature space.
• Common distance metrics (see the short sketch after the application examples below):
  • Euclidean distance
  • Manhattan distance
  • Minkowski distance
  • Cosine similarity
2. Radius Neighbors: Similar to k-NN, but uses all neighbors within a predefined radius for predictions instead of a fixed k.
3. Support Vector Machines (SVM) with RBF Kernel: Uses distance-based measures in its kernel functions (e.g., the Radial Basis Function) to find optimal decision boundaries.
4. Clustering Algorithms:
• K-Means: Clusters points by minimizing within-cluster distances to centroids.
• DBSCAN: Groups points based on density and distance.
• Agglomerative Clustering: Forms clusters by merging points or clusters based on linkage distance.
5. Self-Organizing Maps (SOM): An unsupervised learning method that uses distance-based mapping to reduce dimensions and visualize data.
6. Instance-Based Learning Algorithms: Learn by memorizing instances and using distance metrics for prediction (e.g., Locally Weighted Regression).

Examples of Applications:
1. k-NN in Image Recognition: Classifying images based on the majority label of the k nearest images in feature space.
2. Recommendation Systems: Using cosine similarity or Euclidean distance to recommend similar items.
3. Customer Segmentation with K-Means: Grouping customers based on purchasing behavior or demographics.
4. Anomaly Detection with DBSCAN: Identifying outliers as points with low-density neighborhoods.
5. RBF Kernel in SVM for Classification: Mapping non-linear data into a higher dimension using distance metrics for better separation.
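As referenced above, the following is a minimal sketch of the four distance/similarity measures, computed with plain NumPy for a pair of feature vectors; the vectors x and y and the order p = 3 are made-up illustrative values, not part of the original notes.

```python
import numpy as np

# Two illustrative feature vectors (made-up values).
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.0])

# Euclidean distance: straight-line (L2) distance between the points.
euclidean = np.sqrt(np.sum((x - y) ** 2))

# Manhattan distance: sum of absolute coordinate differences (L1).
manhattan = np.sum(np.abs(x - y))

# Minkowski distance of order p; p = 1 gives Manhattan, p = 2 gives Euclidean.
p = 3
minkowski = np.sum(np.abs(x - y) ** p) ** (1.0 / p)

# Cosine similarity: based on the angle between vectors, independent of length.
cosine_sim = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

print(euclidean, manhattan, minkowski, cosine_sim)
```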
Nearest Neighbours Classification

Nearest Neighbors Classification is a simple, instance-based learning algorithm that classifies a data point based on the majority label of its closest neighbors in the feature space. It assumes that similar data points are near each other.

How it Works:
1. Training Phase: There is no explicit training; the algorithm simply stores the dataset.
2. Prediction Phase: For a new data point (see the code sketch after the applications below):
• Compute the distance between the point and all points in the training data.
• Identify the k closest neighbors (k = a predefined number).
• Determine the majority class label among these neighbors.
• Assign the majority class label to the new point.

Advantages:
• Simple to understand and implement.
• No explicit training phase; efficient for small datasets.
• Non-parametric: no assumptions about the data distribution.

Disadvantages:
• Computationally expensive for large datasets (requires computing distances to all points).
• Sensitive to irrelevant or redundant features.
• The choice of k and the distance metric can greatly affect performance.
• Struggles with imbalanced datasets, as minority classes may get outvoted.

Applications:
1. Image Classification: Classifying images based on pixel-level features.
2. Text Categorization: Labeling text documents using distance metrics in vectorized text space.
3. Anomaly Detection: Identifying outliers by looking for points without enough neighbors in a radius.
4. Medical Diagnosis: Predicting diseases based on symptoms and historical cases.
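A minimal sketch of the k-NN prediction phase described above, using plain Python and NumPy; the function name knn_predict, the tiny training set, and k = 3 are illustrative assumptions rather than part of the original notes.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Predict the class of x_new by majority vote of its k nearest neighbors."""
    # 1. Compute the Euclidean distance from x_new to every training point.
    distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))
    # 2. Identify the indices of the k closest neighbors.
    nearest = np.argsort(distances)[:k]
    # 3. Determine the majority class label among these neighbors.
    labels = [y_train[i] for i in nearest]
    # 4. Assign that majority label to the new point.
    return Counter(labels).most_common(1)[0][0]

# Tiny made-up dataset: two classes in a 2-D feature space.
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([4.9, 5.1])))  # expected: 1
```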
Distance-Based Clustering: K-Means Algorithm

K-Means is a popular distance-based clustering algorithm used for partitioning a dataset into k distinct clusters. It minimizes the variance within each cluster by iteratively updating cluster centroids and assigning data points to the nearest centroid.

How K-Means Works (a short code sketch of this loop follows the applications below):
1. Initialization:
• Select k, the number of clusters.
• Randomly initialize k centroids (cluster centers).
2. Iteration:
• Assign each data point to the nearest centroid using a distance metric, typically Euclidean distance.
• Compute new centroids by taking the mean of all points assigned to each cluster.
3. Convergence: Repeat the assignment and update steps until the centroids no longer change significantly or a maximum number of iterations is reached.

Advantages:
• Simple to implement and computationally efficient.
• Scales well to large datasets.
• Works well for spherical clusters of similar sizes.

Disadvantages:
• Requires predefining k, the number of clusters.
• Sensitive to initialization (different initial centroids can lead to different results).
• Assumes clusters are spherical and evenly sized.
• Struggles with non-convex clusters or datasets with varying densities.

Applications:
1. Customer Segmentation: Grouping customers based on purchasing behavior.
2. Image Compression: Quantizing colors in an image.
3. Document Clustering: Grouping documents with similar topics.
4. Anomaly Detection: Identifying data points that do not fit any cluster.
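A minimal NumPy sketch of the initialization, assignment, and update loop described above; the function name kmeans, the random seed, the stopping tolerance, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def kmeans(X, k, max_iters=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    # Initialization: pick k random data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Update step: each centroid becomes the mean of the points assigned to it.
        # (A production implementation would also handle empty clusters.)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Convergence: stop when the centroids no longer move significantly.
        if np.linalg.norm(new_centroids - centroids) < tol:
            break
        centroids = new_centroids
    return centroids, labels

# Usage with two made-up Gaussian blobs in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
centroids, labels = kmeans(X, k=2)
print(centroids)
```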
Hierarchical Clustering

Hierarchical clustering is an unsupervised machine learning algorithm that groups data points into clusters based on their similarity. It creates a hierarchy or tree-like structure (a dendrogram) to represent the nested grouping of data at different levels of granularity.

Types of Hierarchical Clustering (a short sketch follows the applications below):
1. Agglomerative (Bottom-Up):
• Starts with each data point as its own cluster.
• Iteratively merges the closest clusters until a single cluster remains.
2. Divisive (Top-Down):
• Starts with all data points in one cluster.
• Recursively splits clusters into smaller clusters until each data point is its own cluster.

Advantages:
• Does not require predefining the number of clusters (k).
• Can capture hierarchical relationships among data points.
• Works well for data with arbitrary shapes.

Disadvantages:
• Computationally expensive for large datasets (O(n^2)).
• Sensitive to noise and outliers.
• Requires interpretation of the dendrogram to decide the number of clusters.

Applications:
1. Document Clustering: Grouping text documents with similar content.
2. Gene Expression Analysis: Clustering genes with similar expression profiles.
3. Market Segmentation: Identifying customer segments based on purchasing behavior.
4. Social Network Analysis: Detecting communities or groups.
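A brief sketch of agglomerative (bottom-up) clustering using SciPy's hierarchical-clustering utilities; the synthetic data, the 'ward' linkage choice, and the cut into two flat clusters are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

rng = np.random.default_rng(0)
# Two made-up groups of points in a 2-D feature space.
X = np.vstack([rng.normal(0, 0.5, (10, 2)), rng.normal(4, 0.5, (10, 2))])

# Agglomerative clustering: start from single points and repeatedly merge the
# closest clusters; 'ward' linkage merges the pair that minimizes the increase
# in within-cluster variance.
Z = linkage(X, method="ward")

# Cut the resulting dendrogram to obtain a flat assignment into 2 clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)

# dendrogram(Z) can be plotted with matplotlib to inspect the full hierarchy
# and choose a sensible number of clusters.
```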