SUPPORT VECTOR
MACHINES (SVM)
A POWERFUL ALGORITHM FOR CLASSIFICATION
by
[Link]
CSE-3rd Year
How does Gmail detect spam?
GOAL OF SVM:
The goal of the SVM algorithm is to find the best line
or decision boundary that separates n-dimensional
space into classes, so that new data points can easily
be placed in the correct category in the future.
INTRODUCTION
• SVM is one of the most popular Supervised Learning
algorithms.
• It is used for Classification as well as Regression
problems.
• However, it is primarily used for Classification
problems in Machine Learning.
KEY CONCEPTS
• Hyperplane: Decision boundary between classes.
• Margin: Space between hyperplane and support
vectors.
• Support Vectors: Points that define the margin.
KEY TERMS:
• Hyperplane: The SVM algorithm finds the best line or
decision boundary separating the classes; this best
boundary is called a hyperplane.
KEY TERMS:
• Support vectors: The SVM algorithm finds the points
from both classes that lie closest to the hyperplane.
These points are called support vectors.
KEY TERMS:
• Margin: The distance between the support vectors and the
hyperplane is called the margin. SVM chooses the hyperplane
that maximizes this margin (see the notation sketch below).
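For reference, a sketch of how these terms are usually written (generic notation, not taken from a specific textbook: w is the weight vector, b the bias, and y_i in {-1, +1} the class labels):

\[ \text{Hyperplane: } \mathbf{w}\cdot\mathbf{x} + b = 0 \]
\[ \text{Training constraint: } y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1, \ \text{with equality for the support vectors} \]
\[ \text{Margin width: } \frac{2}{\lVert\mathbf{w}\rVert} \]

Maximizing the margin is therefore equivalent to minimizing the length of w subject to the constraint above.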
TYPES OF SVM:
1. Linear SVM  2. Non-linear SVM
TYPES OF SVM
SVM can be of two types:
1. Linear SVM: Linear SVM is used for linearly separable
data. If a dataset can be classified into two classes by a
single straight line, such data is termed linearly separable,
and the classifier used is called a linear SVM classifier.
2. Non-linear SVM: Non-linear SVM is used for non-linearly
separable data. If a dataset cannot be classified by a straight
line, such data is termed non-linear data, and the classifier
used is called a non-linear SVM classifier.
LINEAR SVM:
• The working of the SVM algorithm can be understood with
an example. Suppose we have a dataset with two tags
(green and blue) and two features, x1 and x2. We want a
classifier that assigns each coordinate pair (x1, x2) to
either green or blue.
• Since this is a 2-D space, a straight line can easily
separate the two classes. But there can be multiple lines
that separate them; SVM picks the one with the widest
margin (see the sketch below).
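A minimal sketch of this two-feature example in Python, assuming scikit-learn is available; the green/blue coordinates below are made-up toy data, and SVC with a linear kernel fits the maximum-margin line between them:

# Toy version of the green/blue example (hypothetical data points).
import numpy as np
from sklearn.svm import SVC

# Two features x1, x2; label 0 = "green", 1 = "blue".
X = np.array([[1.0, 2.0], [2.0, 1.5], [1.5, 1.0],   # green cluster
              [4.0, 4.5], [5.0, 4.0], [4.5, 5.0]])  # blue cluster
y = np.array([0, 0, 0, 1, 1, 1])

# Linear kernel: fit the maximum-margin straight line between the classes.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print("Support vectors:", clf.support_vectors_)          # points that define the margin
print("Line coefficients w:", clf.coef_, "bias b:", clf.intercept_)
print("Prediction for (3, 3):", clf.predict([[3.0, 3.0]]))

The support_vectors_ attribute makes the earlier definition concrete: only those few closest points determine the boundary; moving any other point (without crossing the margin) leaves the line unchanged.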
NON-LINEAR SVM:
If the data is linearly arranged, we can separate it with a
straight line, but for non-linear data we cannot draw a
single straight line.
To separate these data points, we need to add one more
dimension. For linear data we used two dimensions, x and y;
for non-linear data we add a third dimension z, calculated as
z = x² + y² (see the sketch below).
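A minimal sketch of this idea, again assuming scikit-learn; the inner-circle vs. outer-ring points are made-up data that no straight line in (x, y) can separate, but adding z = x² + y² lets a plain linear SVM work. The RBF kernel shown at the end achieves a similar effect without computing z by hand:

# Non-linear toy example: inner circle vs. outer ring (hypothetical data).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 40)
r_in, r_out = 1.0, 3.0   # radii of the two rings
X = np.vstack([np.column_stack([r_in * np.cos(angles), r_in * np.sin(angles)]),
               np.column_stack([r_out * np.cos(angles), r_out * np.sin(angles)])])
y = np.array([0] * 40 + [1] * 40)

# Add the third dimension z = x^2 + y^2, as described above.
z = (X[:, 0] ** 2 + X[:, 1] ** 2).reshape(-1, 1)
X_3d = np.hstack([X, z])

# In (x, y, z) the two classes become linearly separable.
linear_3d = SVC(kernel="linear").fit(X_3d, y)
print("Accuracy with added z feature:", linear_3d.score(X_3d, y))

# The RBF kernel handles the same data directly in (x, y).
rbf = SVC(kernel="rbf").fit(X, y)
print("Accuracy with RBF kernel:", rbf.score(X, y))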
APPLICATIONS
1. Medical Diagnosis:
• Use: Classifying diseases (e.g., cancer detection,
diabetes prediction).
• Example: Identifying whether a tumor is benign or
malignant using patient data.
2. Email Spam Detection:
• Use: Classifying emails as spam or not spam.
• Example: Gmail and Outlook use machine learning
models such as SVMs to filter messages (see the sketch
after this list).
3. Gene expression classification
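For the spam-detection use case above, a minimal sketch of how an SVM-based text filter could be wired up in Python (the example emails, labels, and predictions are made up for illustration; real filters are trained on large labeled corpora):

# Tiny illustrative spam filter: TF-IDF features + linear SVM (toy data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

emails = [
    "Win a free prize now, click here",            # spam
    "Limited offer, claim your reward today",      # spam
    "Meeting moved to 3pm tomorrow",               # not spam
    "Please review the attached project report",   # not spam
]
labels = [1, 1, 0, 0]   # 1 = spam, 0 = not spam

# TF-IDF turns each email into a high-dimensional sparse vector;
# the linear SVM then finds a separating hyperplane in that space.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(emails, labels)

print(model.predict(["Claim your free reward now"]))    # likely [1] (spam)
print(model.predict(["Project report review at 3pm"]))  # likely [0] (not spam)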
PROS AND CONS
• Pros:
  • Works well on small datasets.
  • Effective in high-dimensional spaces.
  • Good generalization.
• Cons:
  • Slow to train on large datasets.
  • Sensitive to noisy data with heavily overlapping classes.
CONCLUSION
• SVM finds optimal decision boundaries.
• Works well in both linear and non-linear cases.
• Choose SVM when you need a strong classifier with
small or medium data.