Mid 1 Answer

Uploaded by

mount3172


Q.1 Define Machine Learning

Machine Learning (ML):

● Machine → A device or system created by humans to perform tasks.

● Learning → The process of acquiring knowledge, behaviors, skills, or values.

● Machine Learning → A computer system’s ability to learn from

experience, using algorithms and statistical models, by analyzing patterns
in data and improving its performance over time.

Simple Definition:

Machine Learning is when computers learn from data (experience) to perform tasks better without being explicitly programmed.

Example:

● When you type on your phone, it predicts the next word → the system
learns from previous text patterns.

✅ Applications of Machine Learning

Machine learning is applied in almost every field of life. Below are key domains
and how they benefit from ML:

📊 1. Banking and Finance

● Challenge: Credit card fraud and customer churn.

● How ML helps:

○ Detects fraudulent transactions by identifying abnormal patterns.

○ Analyzes customer behavior and helps banks retain customers by
offering personalized plans.

🛡 2. Insurance
● Challenge: Managing risks and claims.

● How ML helps:

○ Predicts customer risk during onboarding by analyzing past data.

○ Improves claims management by spotting unusual claims or predicting fraud.

🏥 3. Healthcare
● Challenge: Monitoring patient health.

● How ML helps:

○ Uses data from wearable devices to predict health issues in real time.

○ Alerts doctors or users if a critical health condition is detected, allowing preventive action.

📦 Other Examples (Additional Applications):

● Self-driving cars: Learns from environment data to navigate safely.

● AI personal assistants (e.g., Google Assistant): Understands speech and schedules tasks.
● Recommendation systems (e.g., Amazon): Suggests products based on
user preferences.

● Spam filters: Detects and classifies unwanted emails.

● Image recognition: Identifies objects, faces, and scenes in pictures.

✅ Key Points to Remember

1. Machine + Learning = ML → Computers learning from data.

2. ML is everywhere → Finance, insurance, healthcare, transport,

e-commerce.

3. ML helps in prediction, classification, risk management, and

personalization.

4. Example → Fraud detection, health monitoring, self-driving cars.

5. It covers three main learning paradigms: supervised, unsupervised, and reinforcement learning.

Q.2 Bayes’ Theorem – Introduction
● What is it?
A mathematical formula that helps compute the probability of a hypothesis
being true given some evidence.

● Key idea:
We update our belief (probability) about a hypothesis when we see new
data.

● Who discovered it?

Named after Thomas Bayes.

● Used in:
Classification, decision making, spam filtering, medical diagnosis,
recommendation systems.

✅ Important Terms in Bayes’ Theorem

➤ Prior Probability (P(H))

● Represents what we believe about the hypothesis before seeing any new
data.

● Example → Probability that a patient has a malignant tumor, based on

general population statistics.

➤ Likelihood (P(X|H))

● The probability of observing the evidence given that the hypothesis is true.

● Example → If a patient has a malignant tumor, what’s the chance the lab
test shows positive?
➤ Posterior Probability (P(H|X))

● The updated belief after observing the evidence.

● It combines prior knowledge and the new evidence.

● Example → After seeing the lab test result, what’s the chance the patient
actually has a malignant tumor?

➤ Evidence (P(X))

● The total probability of observing the evidence, regardless of which

hypothesis is true.

✅ Bayes’ Theorem Formula

$$P(H|X) = \frac{P(X|H) \times P(H)}{P(X)}$$

● Posterior = (Likelihood × Prior) / Evidence


✅ Example – Email Spam Filtering

Scenario:

You receive an email and want to check whether it’s spam or not based on
certain keywords like “Buy now”, “Free offer”, etc.
➤ Given Information:

● 20% of all emails are spam → P(Spam) = 0.20

● 80% of all emails are not spam → P(Not Spam) = 0.80

● If an email is spam, there’s a 70% chance it contains the keyword “Buy

now” → P(Buy now | Spam) = 0.70

● If an email is not spam, there’s a 10% chance it contains the keyword “Buy
now” → P(Buy now | Not Spam) = 0.10

➤ Question:

If an email contains the keyword “Buy now”, what is the probability that it’s
actually spam?

That is, find P(Spam | Buy now).
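
The answer follows from Bayes’ theorem, with the evidence term P(Buy now) obtained from the law of total probability. The short Python sketch below simply plugs in the numbers given above; the variable names are illustrative and not part of the original notes.

```python
# Values given in the scenario above
p_spam = 0.20                # prior: P(Spam)
p_not_spam = 0.80            # P(Not Spam)
p_kw_given_spam = 0.70       # likelihood: P(Buy now | Spam)
p_kw_given_not_spam = 0.10   # P(Buy now | Not Spam)

# Evidence: total probability of seeing the keyword in any email
p_kw = p_kw_given_spam * p_spam + p_kw_given_not_spam * p_not_spam

# Posterior: P(Spam | Buy now) = likelihood * prior / evidence
posterior = p_kw_given_spam * p_spam / p_kw

print(f"P(Buy now) = {p_kw:.2f}")                 # 0.22
print(f"P(Spam | Buy now) = {posterior:.3f}")     # 0.636
```

So the keyword raises the probability of spam from 20% (the prior) to about 63.6% (the posterior).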

✅ Key Points to Remember:
✔ Even though only 20% of all emails are spam, the presence of the keyword
“Buy now” makes it much more likely to be spam.
✔ The prior probability (P(Spam)) represents how common spam is in general.
✔ The likelihood (P(Buy now | Spam)) represents how often spam emails use
that keyword.
✔ The posterior probability tells us how suspicious the email is after seeing the
keyword.

✅ Applications
✔ Spam filtering in email systems
✔ Identifying phishing attempts
✔ Recommender systems analyzing user activity
✔ Fraud detection in financial transactions

✅ Key Insights
✔ Bayes’ theorem helps update beliefs based on new evidence.
✔ Prior knowledge is critical in determining the outcome.
✔ Likelihood shows how strongly the evidence supports the hypothesis.
✔ Posterior gives the final updated probability after considering evidence.
✔ Even with an accurate test, a positive result for a rare condition can still be more likely false than true, because the low prior pulls the posterior down.

✅ Use Cases
✔ Medical diagnosis (e.g., cancer, diseases)
✔ Spam filtering in emails
✔ Sentiment analysis in social media
✔ Recommendation systems (e.g., shopping sites)
✔ Decision-making in AI models

✅ Final Summary – Easy to Remember

● Prior → What you knew before seeing the data.

● Likelihood → How well the data supports the hypothesis.

● Posterior → What you believe after seeing the data.

● Formula → Posterior = (Likelihood × Prior) / Evidence.

Q.3 Types of Machine Learning – Overview
Machine learning can be grouped into three main types, depending on how the
learning is done and how the data is used:

1. Supervised Learning

2. Unsupervised Learning

3. Reinforcement Learning

Each type has its own use cases, methods, and examples.

✅ 1. Supervised Learning
👉 Also called predictive learning.
✅ Definition:
● The model learns from labeled data (input + correct output).

● It uses past data to predict or classify new data.

✅ Key Characteristics:
✔ The data has inputs and outputs clearly labeled.
✔ The model learns the relationship between inputs and outputs.
✔ It’s used when the goal is to predict outcomes.

✅ Examples:
● Predicting whether a tumor is malignant or benign.

● Predicting house prices based on area, location, etc.

● Classifying emails as spam or non-spam.

● Forecasting demand or stock prices.

✅ Types within Supervised Learning:

1. Classification:
✔ Predicts a category or class.
✔ Example: Image classification → Cat, Dog, or Bird.

2. Regression:
✔ Predicts a continuous value.
✔ Example: Forecasting weather temperatures or sales numbers.

✅ Important Notes:
● Accuracy depends on the quality of labeled data.

● Better data → better prediction.

✅ 2. Unsupervised Learning
👉 Also called descriptive learning.
✅ Definition:
● The model learns from unlabeled data without known outputs.

● It finds patterns or structures within the data.

✅ Key Characteristics:
✔ The data is unlabeled → no correct answers provided.
✔ The model groups data or finds patterns.
✔ It’s used to explore hidden structures in data.

✅ Examples:
● Customer segmentation → Grouping customers based on purchase
behavior.

● Credit card fraud detection → Identifying suspicious patterns.

● Recommendation systems → Finding similar products for users.

✅ Types within Unsupervised Learning:

1. Clustering:
✔ Groups similar data points together.
✔ Example: Grouping customers with similar shopping habits.

2. Association:
✔ Finds rules or relationships between data items.
✔ Example: Market basket analysis → People who buy bread also buy
butter.
✅ Important Notes:
● Good for exploratory data analysis.

● Helps in identifying hidden relationships.

✅ 3. Reinforcement Learning
👉 Also called feedback-based learning.
✅ Definition:
● The model learns by trial and error, interacting with the environment.

● It receives rewards or penalties based on actions.

✅ Key Characteristics:
✔ There is no fixed labeled dataset → the agent generates its own experience by interacting with the environment.
✔ The model learns from consequences of actions.
✔ It’s used where long-term goals matter.

✅ Examples:
● Self-driving cars → Learn to navigate safely.

● Robots → Learn optimal paths or actions.

● Game-playing AI → Learns strategies by competing against itself.

✅ Important Notes:
● Works in dynamic environments.

● Optimizes performance over time.

✅ Comparison of Types of Machine Learning

Feature | Supervised Learning | Unsupervised Learning | Reinforcement Learning
Type of data | Labeled data | Unlabeled data | Interaction data
Output | Prediction/classification | Patterns/clusters | Actions & rewards
Use case | Forecasting, classification | Grouping, discovering patterns | Navigation, gaming
Feedback signal | Supervised signal | No feedback | Reward/Penalty
Example | Spam detection | Customer segmentation | Self-driving cars

✅ Final Summary – Easy to Remember
✔ Supervised → “Teacher-guided learning”
✔ Unsupervised → “Finding hidden patterns”
✔ Reinforcement → “Learning from rewards & penalties”

Q.4 Support Vector Machine (SVM) – Basics

✔ SVM is a supervised learning algorithm
✔ Mainly used for classification, but also applied in regression
✔ It finds the best decision boundary (called a hyperplane) that separates
classes
✔ The goal is to maximize the margin, i.e. the distance between the hyperplane and the nearest data points of each class
✔ The data points closest to the hyperplane are called support vectors, and
they define the hyperplane

📌 Key Terms
➤ Hyperplane

● The decision boundary that separates different classes

● In 2D, it’s a line; in 3D, it’s a plane; in higher dimensions, it’s a hyperplane

● We aim to find the best hyperplane that separates the classes with the
largest margin

➤ Support Vectors

● Data points closest to the hyperplane

● They “support” or define the boundary

● These are the most critical elements that the SVM uses to construct the
decision boundary

➤ Maximum Margin Hyperplane (MMH)

● The hyperplane that maximizes the distance (margin) between the closest
data points of both classes

● A wider margin helps the model generalize better and avoid misclassification

● For linearly separable data, the MMH can be found from the convex hulls of the two classes: it is the perpendicular bisector of the shortest line connecting the two hulls

● For non-linearly separable data, SVM uses the kernel trick to transform
data into higher dimensions where it becomes separable

✅ Types of SVM
1️⃣ Linear SVM

● Used when the data can be separated with a straight line or hyperplane

● Example → If two classes can be divided by a line in a 2D plot, Linear

SVM is applicable

2️⃣ Non-Linear SVM

● Used when data cannot be separated by a straight line

● SVM uses kernel functions (like polynomial or radial basis function) to
transform data into higher dimensions

● Example → Complex datasets like circles, spirals, or overlapping clusters

can be handled using Non-linear SVM

✅ Algorithm Steps (Basic Idea)

1. Input: Training data with labeled classes

2. Find the best hyperplane that maximizes the margin between classes

3. Identify the support vectors closest to the boundary

4. For non-linear problems, apply kernel functions to map data to a

higher-dimensional space

5. Optimize the hyperplane to reduce misclassification

6. Classify new data points based on their position relative to the hyperplane
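
Step 4 rests on the fact that a kernel function returns the same value as a dot product taken in a higher-dimensional feature space, without ever constructing that space. The sketch below checks this for the degree-2 polynomial kernel K(x, z) = (x·z)² on 2-D inputs. The feature map `phi` and the two sample points are illustrative assumptions, not taken from the notes, and this is only a check of the kernel idea, not an SVM trainer.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return dot(x, z) ** 2

def phi(x):
    """Explicit 2-D -> 3-D feature map whose dot product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

# Hypothetical sample points
x = (1.0, 2.0)
z = (3.0, -1.0)

k_implicit = poly2_kernel(x, z)        # no mapping needed
k_explicit = dot(phi(x), phi(z))       # map first, then take the dot product

print(k_implicit, k_explicit)
print(math.isclose(k_implicit, k_explicit))  # True (up to rounding)
```

Both routes give the same number (1 here, up to floating-point rounding), which is why an SVM can work in the higher-dimensional space while only ever evaluating the kernel on the original inputs.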

✅ Strengths of SVM
✔ Works for both classification and regression
✔ Fairly robust to noisy data and outliers when a soft margin is used
✔ Provides promising prediction results
✔ Well-suited for binary classification tasks
✔ Maximizes margin for better generalization

✅ Weaknesses of SVM
✔ Natively a binary classifier; multi-class problems need schemes such as one-vs-rest or one-vs-one
✔ Complex and hard to interpret in high-dimensional spaces
✔ Slow to train on large datasets with many features or instances
✔ Memory-intensive computations required for large datasets
✔ The resulting model is hard to interpret and behaves like a “black box”

✅ Applications of SVM
✔ Bioinformatics → Detecting cancer or genetic disorders by classifying data
into two groups
✔ Face Detection → Separating images into face vs. non-face
✔ Image Classification → Identifying objects in images
✔ Text Categorization → Classifying documents, emails, or news articles
✔ Financial predictions → Credit risk assessment, stock trends

✅ Final Summary – Easy to Remember

✔ SVM = Finding the best boundary to separate data
✔ Support Vectors = Critical points that define the boundary
✔ MMH = Largest margin between classes → better generalization
✔ Linear SVM → straight line separation
✔ Non-linear SVM → kernel trick for complex data
✔ Strengths → robust, accurate, margin maximization
✔ Weaknesses → binary focus, complex, slow for big data
✔ Applications → healthcare, image analysis, text classification
Q.5 k-Nearest Neighbors (kNN) – Basics
✔ kNN is a simple, supervised learning algorithm
✔ Used for classification and regression, but primarily for classification
✔ It works by comparing a new data point with its k closest neighbors from the
training set
✔ The output is determined by the majority class among the neighbors (for
classification) or the average of neighbors (for regression)

📌 How kNN Works

1. Choose k → The number of neighbors to consider

2. Measure distance → Find the distance between the new data point and
all training points (common methods: Euclidean, Manhattan)

3. Select neighbors → Pick the k closest training points

4. Vote or average →

○ Classification → Take the majority class among neighbors

○ Regression → Take the average of the neighbors’ values

5. Assign the label → Based on the vote or average, assign the class or
value to the new point
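
The five steps above can be sketched in a few lines of plain Python. This is a minimal illustration for classification with Euclidean distance; the function name `knn_classify` and the toy training points are hypothetical, not from the notes.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of (features, label) pairs; features are tuples of numbers.
    """
    # Step 2: measure the Euclidean distance to every training point and sort
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Step 3: keep the k closest neighbors
    nearest_labels = [label for _, label in by_distance[:k]]
    # Steps 4-5: majority vote decides the label
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

print(knn_classify(train, (2.0, 2.0), k=3))  # A
print(knn_classify(train, (6.0, 7.0), k=3))  # B
```

Note that all the work happens at prediction time, which is what "lazy learning" means: there is no training step, only a stored dataset. For regression, the vote would be replaced by the average of the neighbors' values.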

✅ Key Characteristics
✔ Instance-based → Learns by comparing examples, not by building a model
✔ Lazy learning → Doesn’t train a model upfront; computes at the time of
classification
✔ Non-parametric → Doesn’t make assumptions about data distribution
✅ Strengths of kNN
✔ Simple and intuitive – easy to understand and implement
✔ No training phase – stores training data and makes predictions during
classification
✔ Effective for small datasets – works well when data is not huge
✔ Adaptable to non-linear data – doesn’t require the data to be linearly
separable
✔ Works with multi-class problems – can classify into more than two
categories

✅ Weaknesses of kNN
✔ Computationally expensive – requires distance calculation with all data
points for each prediction
✔ Sensitive to irrelevant or noisy features – irrelevant data can mislead
classification
✔ Needs careful choice of k – too small leads to noise influence; too large
leads to smoothing over differences
✔ Memory-intensive – stores all training data in memory
✔ Curse of dimensionality – performance drops on high-dimensional data, where points become sparse and the distances between them less informative

✅ Applications of kNN
✔ Pattern Recognition → Handwriting, digit recognition
✔ Medical Diagnosis → Classifying diseases based on patient attributes
✔ Recommendation Systems → Suggest products similar to previous ones
✔ Credit Risk Analysis → Classifying loan applicants as high or low risk
✔ Customer Behavior Prediction → Understanding customer preferences in
marketing
✔ Image Recognition → Identifying objects based on similar images
✅ Final Summary – Easy to Remember
✔ kNN = “Find your neighbors and ask them what they are!”
✔ Strengths → Simple, no training, flexible
✔ Weaknesses → Slow, needs memory, sensitive to noise and irrelevant
data
✔ Applications → Medical, finance, marketing, image recognition


✅ What is Regression?
✔ Regression is a supervised learning technique used to predict a
continuous value based on input features.
✔ It finds the relationship between the dependent variable (output) and
independent variables (inputs).
✔ Used when the target variable is numerical, such as price, temperature, or
salary.
✔ The goal is to fit a model that best explains how the output depends on the
inputs.

📌 Key Points
✔ It’s not about classifying into categories → it predicts numbers
✔ It helps in forecasting, trend analysis, and estimating unknown values
✔ It assumes that there is some underlying pattern in the data
✅ Types of Regression
1. Linear Regression
✔ Relationship is modeled with a straight line

2. Multiple Linear Regression

✔ More than one input feature is used to predict the output

3. Polynomial Regression

✔ Models non-linear relationships by using polynomial functions

4. Ridge Regression

✔ A regularization method to reduce overfitting by penalizing large
coefficients

5. Lasso Regression

✔ Another regularization technique that can shrink some coefficients to
zero

6. Logistic Regression (though technically classification)

✔ Predicts probability for categorical outcomes

7. Support Vector Regression (SVR)

✔ Uses Support Vector Machine principles for regression problems

Q.6 Linear Regression – Detailed Explanation

✔ Definition:

Linear regression tries to find the best-fitting straight line that predicts the output
(Y) from one or more inputs (X).

✔ Equation of a line:

Y = mX + c
Where:
✔ Y = predicted output
✔ X = input feature
✔ m = slope of the line (how much Y changes with X)
✔ c = intercept (value of Y when X is 0)

✅ Example – Predicting House Prices

Problem:
You want to predict the price of a house based on its area (in square feet).

Given Data:

Area (sq ft) | Price (in ₹1000s)
1000 | 200
1500 | 250
2000 | 300
2500 | 350
3000 | 400

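
The slope and intercept for this data can be obtained with the ordinary least-squares formulas m = Sxy / Sxx and c = mean(Y) − m · mean(X). The sketch below applies them to the table above; the query area of 1800 sq ft is a hypothetical input chosen only to show a prediction.

```python
# Data from the table above
areas = [1000, 1500, 2000, 2500, 3000]   # X, in sq ft
prices = [200, 250, 300, 350, 400]       # Y, in thousands of rupees

n = len(areas)
mean_x = sum(areas) / n
mean_y = sum(prices) / n

# Least-squares estimates of slope (m) and intercept (c)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices))
s_xx = sum((x - mean_x) ** 2 for x in areas)
m = s_xy / s_xx
c = mean_y - m * mean_x

def predict(area):
    """Predicted price (in thousands of rupees) for a given area in sq ft."""
    return m * area + c

print(f"m = {m:.2f}, c = {c:.1f}")                              # m = 0.10, c = 100.0
print(f"Predicted price for 1800 sq ft: {predict(1800):.1f}")   # 280.0
```

The fitted line is Y = 0.1X + 100: each extra square foot adds ₹100 to the price. Because the five points lie exactly on a line, the fit passes through all of them; real data would leave residual errors.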
✅ Strengths of Linear Regression
✔ Simple and easy to understand
✔ Provides clear relationship between input and output
✔ Works well when data shows a linear trend
✔ Good for prediction and forecasting
✔ Helps identify which features affect the output the most
✅ Weaknesses of Linear Regression
✔ Doesn’t work well if the relationship is not linear
✔ Sensitive to outliers → extreme values can skew the results
✔ Assumes constant variance and normal distribution of errors
✔ Not suitable for complex, multi-dimensional problems without transformation

✅ Applications of Linear Regression

✔ House price prediction
✔ Stock market forecasting
✔ Salary estimation based on experience
✔ Demand forecasting in business
✔ Agricultural yield prediction
✔ Temperature and rainfall analysis

✅ Final Summary – Easy to Remember

✔ Regression = Predicting numbers, not categories
✔ Linear Regression = Best straight line through data points
✔ Formula → Y = mX + c
✔ Example → Predict house price from area
✔ Strengths → Simple, interpretable, fast
✔ Weaknesses → Sensitive to outliers, assumes linearity
✔ Applications → Finance, real estate, agriculture, weather forecasting
Q.7
1. Hierarchical Clustering
👉 Groups data into a hierarchy of clusters without predefining the number of
clusters.

Key Points

1. Definition

○ Builds a tree-like structure (dendrogram) of nested clusters.

○ Clusters are formed based on a distance matrix instead of specifying the number of clusters in advance.

2. Types

○ Agglomerative (Bottom-Up)

■ Start: Each data point = its own cluster.

■ At each step: Merge the two most similar clusters.

■ Stop: When all objects merge into one big cluster.

■ Example: AGNES (Agglomerative Nesting).

○ Divisive (Top-Down)

■ Start: All data in one cluster.

■ At each step: Split the most heterogeneous cluster.

■ Stop: Until each object is a separate cluster.

■ Example: DIANA (Divisive Analysis).

3. Distance Measures Between Clusters

○ Single Link → Minimum distance between two points of different

clusters.

○ Complete Link → Maximum distance between two points of

different clusters.

○ Average Link → Average distance between points across clusters.

○ Centroid → Distance between centroids.

○ Medoid → Distance between most central points (medoids).

4. Dendrogram

○ A tree diagram showing how clusters are merged/split.

○ By “cutting” at a desired level → final clusters are obtained.

5. Strengths

○ Easy to understand and interpret.

○ No need to pre-define k.

○ Good visualization via dendrogram.

6. Weaknesses

○ Once merged/split → cannot be undone.

○ Poor with large datasets and mixed data types.

○ Sensitive to missing data.

○ Dendrograms are often misinterpreted.
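
The cluster-to-cluster distance measures listed in point 3 differ only in how they summarize the pairwise point distances. A minimal sketch, assuming Euclidean distance and two small hypothetical clusters (the function names are illustrative):

```python
import math
from itertools import product

def pairwise_distances(a, b):
    """All Euclidean distances between a point of cluster a and a point of cluster b."""
    return [math.dist(p, q) for p, q in product(a, b)]

def single_link(a, b):
    return min(pairwise_distances(a, b))      # closest pair

def complete_link(a, b):
    return max(pairwise_distances(a, b))      # farthest pair

def average_link(a, b):
    d = pairwise_distances(a, b)
    return sum(d) / len(d)                    # mean over all pairs

def centroid(cluster):
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))

def centroid_link(a, b):
    return math.dist(centroid(a), centroid(b))

# Hypothetical clusters
cluster_a = [(0, 0), (0, 2)]
cluster_b = [(3, 0), (3, 2)]

print("single  :", single_link(cluster_a, cluster_b))
print("complete:", complete_link(cluster_a, cluster_b))
print("average :", average_link(cluster_a, cluster_b))
print("centroid:", centroid_link(cluster_a, cluster_b))
```

For the same pair of clusters, single link is never larger than average link, which is never larger than complete link. An agglomerative method such as AGNES merges, at each step, the two clusters with the smallest value of whichever measure was chosen.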

2. K-Means Clustering
👉 A partitioning clustering method based on centroids.
Key Points

1. Definition

○ Groups n objects into k clusters.

○ Each cluster is represented by a centroid (mean point).

○ Objective: Minimize Sum of Squared Errors (SSE).

2. Algorithm Steps

○ Choose k (number of clusters).

○ Initialize → Randomly select k objects as initial centroids.

○ Assignment → Assign each object to the nearest centroid.

○ Update → Recompute centroids of clusters.

○ Repeat the assignment and update steps until the centroids no longer change (convergence).

3. Concept

○ Uses Euclidean distance (commonly) to measure similarity.

○ Works by iterative relocation (objects may be reassigned
repeatedly).

4. Choosing k

○ Done using Elbow Method (plot SSE vs. k, choose elbow point).

○ Or Silhouette score.

5. Advantages

○ Simple and fast.

○ Works well on large datasets.

○ Produces tighter clusters.

6. Limitations

○ Must specify k beforehand.

○ Only works when mean is defined (not categorical data).

○ Struggles with non-convex clusters or different sized clusters.

○ Sensitive to noise and outliers.

7. Example (from notes)

○ Data points: A1(2,10), A2(2,5), A3(8,4), B1(5,8), B2(7,5), B3(6,4),

C1(1,2), C2(4,9).

○ k = 3, initial centers chosen → Iteratively refine until stable clusters

are formed.
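
The iterations for this example can be traced with a short script. The notes do not say which points were picked as initial centers, so the sketch below assumes A1, B1 and C1; a different choice may lead to different clusters. Euclidean distance is used, as stated in the concept section.

```python
import math

points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8),  "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2),  "C2": (4, 9),
}

# Assumption: initial centers are A1, B1 and C1 (not specified in the notes)
centers = [points["A1"], points["B1"], points["C1"]]

def assign(points, centers):
    """Assignment step: put each point in the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(name)
    return clusters

def update(points, clusters):
    """Update step: move each center to the mean of its members.

    Assumes no cluster becomes empty, which holds for this data.
    """
    return [
        tuple(sum(points[n][d] for n in names) / len(names) for d in range(2))
        for names in clusters
    ]

iterations = 0
while True:
    iterations += 1
    clusters = assign(points, centers)
    new_centers = update(points, clusters)
    if new_centers == centers:   # converged: centers did not move
        break
    centers = new_centers

for names, center in zip(clusters, centers):
    print(sorted(names), center)
```

Under this assumption the algorithm settles on the clusters {A1, B1, C2}, {A3, B2, B3} and {A2, C1}.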
✅ Easy to Remember Tip
● Hierarchical → Tree (AGNES & DIANA)

● K-Means → Centroid & Iteration (Partitioning)

Q.8

🔹 How Overfitting in Decision Trees Can Be Avoided

1. Pruning the Tree

○ Pre-pruning (early stopping): Stop splitting when nodes become too

small or improvement is negligible.

○ Post-pruning: Grow full tree first, then remove unnecessary

branches.

2. Restrict Tree Depth

○ Limit maximum depth → avoids too many levels → reduces

complexity.

3. Minimum Samples per Split/Leaf

○ Require a minimum number of samples before splitting a node.

○ Prevents tree from fitting noise in small sample subsets.

4. Limit Number of Features

○ Restrict number of features considered at each split to avoid overly
complex boundaries.

5. Use Ensemble Methods

○ Combine multiple trees (Random Forest, Gradient Boosting) →

reduces variance and prevents overfitting.

👉 Easy Tip to Remember:

Think of “PRUNE + LIMIT” → Prune tree, Limit depth, Limit samples, Limit
features, Use ensembles.

🔹 Out-of-Bag (OOB) Error in Random Forest

1. Bootstrap Sampling

○ Each tree is trained on a random sample (with replacement) of the

dataset.

○ On average about 63% (roughly 2/3rd) of the distinct samples end up in a tree’s bootstrap sample → the remaining 37% or so (roughly 1/3rd) are left out (called Out-of-Bag samples).

2. OOB Testing

○ The left-out samples (not used for training a tree) are used as a test
set for that tree.

○ Gives a nearly unbiased estimate of prediction error, because each sample is scored only by trees that never saw it during training.

3. OOB Error Rate

○ Average error across all trees, measured using their respective OOB
samples.
○ Acts like a built-in cross-validation for Random Forest.

4. Advantages of OOB

○ No need for separate validation dataset.

○ Saves computation time.

○ Provides reliable error estimate.

👉 Easy Tip to Remember:

OOB = “Free Test Set” → Each tree ignores some data → That ignored data
tests the tree → Gives error estimate.
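
The "about one-third left out" figure can be checked with a small simulation. The sketch below draws one bootstrap sample and counts how many items were never picked; the dataset size and the random seed are arbitrary assumptions, used only to make the run repeatable. It illustrates the sampling step only, not a Random Forest.

```python
import random

random.seed(0)   # arbitrary seed, only so the run is repeatable
n = 10_000       # hypothetical dataset size

# Bootstrap sample: n draws with replacement from indices 0..n-1
in_bag = {random.randrange(n) for _ in range(n)}

# Out-of-Bag samples: indices that were never drawn
oob = [i for i in range(n) if i not in in_bag]
oob_fraction = len(oob) / n

# Theory: a given sample is missed by all n draws with probability (1 - 1/n)^n,
# which approaches 1/e (about 0.368) as n grows
expected = (1 - 1 / n) ** n

print(f"Simulated OOB fraction: {oob_fraction:.3f}")
print(f"Theoretical value:      {expected:.3f}")
```

The simulated fraction lands close to the theoretical 0.368, which is where the rule of thumb "about 2/3rd in the bag, 1/3rd out of the bag" comes from.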

Ai Notes
No ratings yet
Ai Notes
8 pages
Introduction to Data Science Concepts
No ratings yet
Introduction to Data Science Concepts
56 pages
ML Unit1
No ratings yet
ML Unit1
6 pages
Intro Machine Learning
No ratings yet
Intro Machine Learning
4 pages
Introduction to Machine Learning
No ratings yet
Introduction to Machine Learning
14 pages
Sonu Dkash Updated PDF
No ratings yet
Sonu Dkash Updated PDF
21 pages
ML - Module 1
No ratings yet
ML - Module 1
30 pages
Module 1 ML
No ratings yet
Module 1 ML
8 pages
Machine Learning Practical File
No ratings yet
Machine Learning Practical File
41 pages
Introduction To Machine Learning
No ratings yet
Introduction To Machine Learning
316 pages
Machine Learning Unil-1
No ratings yet
Machine Learning Unil-1
20 pages
Sec 1630
No ratings yet
Sec 1630
145 pages
MLES
No ratings yet
MLES
30 pages
ML Insights for Researchers & Practitioners
No ratings yet
ML Insights for Researchers & Practitioners
17 pages
Unit 1
No ratings yet
Unit 1
92 pages
ML Ans
No ratings yet
ML Ans
13 pages
AI Module 1 Simple Notes
No ratings yet
AI Module 1 Simple Notes
14 pages
A Preliminary Idea On Machine Learning
No ratings yet
A Preliminary Idea On Machine Learning
40 pages
Machine Learning INTRO
No ratings yet
Machine Learning INTRO
12 pages
Machine Learning Notes
91% (11)
Machine Learning Notes
19 pages
Unit 1
No ratings yet
Unit 1
51 pages
Machine Learning Notes
No ratings yet
Machine Learning Notes
48 pages
Machine Learning Basics
No ratings yet
Machine Learning Basics
39 pages
Ch7 Introduction To Machine Learning
No ratings yet
Ch7 Introduction To Machine Learning
29 pages
Machine Learning: Professor Department of Computer Science & Engineering
No ratings yet
Machine Learning: Professor Department of Computer Science & Engineering
59 pages
Introduction To Machine Learning
No ratings yet
Introduction To Machine Learning
21 pages
Machine Learning - 1
No ratings yet
Machine Learning - 1
52 pages
Machine Learning Introduction and Types
No ratings yet
Machine Learning Introduction and Types
7 pages
BE02000041 Funda of AI Unit 3 Basics of ML
No ratings yet
BE02000041 Funda of AI Unit 3 Basics of ML
86 pages
Asset-V1 MKAU+SEng9032+DEV 01+type@asset+block@ChapOne
No ratings yet
Asset-V1 MKAU+SEng9032+DEV 01+type@asset+block@ChapOne
29 pages
Advanced Quantum Manifestation Guide
No ratings yet
Advanced Quantum Manifestation Guide
5 pages
Discrete Fourier Transform Overview
No ratings yet
Discrete Fourier Transform Overview
2 pages
306-Stresses in Foundation Soils Due To Vertical Subsurface Loading, Gedde
No ratings yet
306-Stresses in Foundation Soils Due To Vertical Subsurface Loading, Gedde
25 pages
Quantitative Methods in Management
No ratings yet
Quantitative Methods in Management
100 pages
Practice Question On Capital Budgeting
No ratings yet
Practice Question On Capital Budgeting
4 pages
Electrostatics Material
No ratings yet
Electrostatics Material
7 pages
Grashof Law
No ratings yet
Grashof Law
7 pages
Assignment 8.24 - Elements of Line Exercise
No ratings yet
Assignment 8.24 - Elements of Line Exercise
17 pages
Importance of Statistics in Data Management
No ratings yet
Importance of Statistics in Data Management
3 pages
Patterns and Mathematics in Nature
No ratings yet
Patterns and Mathematics in Nature
26 pages
18EE3AI22 Kulkarni Yash Rajendra AI69002 Design Lab Report
No ratings yet
18EE3AI22 Kulkarni Yash Rajendra AI69002 Design Lab Report
3 pages
Intervention Activities of Elementary Mathematics Teachers Implemented in The New Normal
No ratings yet
Intervention Activities of Elementary Mathematics Teachers Implemented in The New Normal
6 pages
Artificial Intelligence Curriculum Plasmid PDF
No ratings yet
Artificial Intelligence Curriculum Plasmid PDF
5 pages
Wma14 01 Que 20230120
No ratings yet
Wma14 01 Que 20230120
32 pages
Class 10 QP of Pioneer Education
No ratings yet
Class 10 QP of Pioneer Education
42 pages
Mechanics 1
No ratings yet
Mechanics 1
7 pages
Grade 10 & 11 Study Guide
No ratings yet
Grade 10 & 11 Study Guide
8 pages
SSC CGL 2024 Detailed Syllabus Guide
No ratings yet
SSC CGL 2024 Detailed Syllabus Guide
15 pages
Bored Pile Capacity by Direct SPT Methods Applied To 40 Case Histories PDF
No ratings yet
Bored Pile Capacity by Direct SPT Methods Applied To 40 Case Histories PDF
5 pages
Oberlin Physics 110: Mechanics & Relativity
No ratings yet
Oberlin Physics 110: Mechanics & Relativity
151 pages
107 - 76409 - Session Wise Problems
No ratings yet
107 - 76409 - Session Wise Problems
14 pages
DLL Matatag - Mathematics 8q1 w1
No ratings yet
DLL Matatag - Mathematics 8q1 w1
11 pages
Importat Question Panda Series
No ratings yet
Importat Question Panda Series
27 pages
Case-Based Reasoning Overview and Applications
No ratings yet
Case-Based Reasoning Overview and Applications
12 pages
Big M Method
No ratings yet
Big M Method
28 pages
CH 8 Volume and Surface Area of Prism and Pyramid
100% (1)
CH 8 Volume and Surface Area of Prism and Pyramid
46 pages
Adobe Scan Dec 30, 2023
No ratings yet
Adobe Scan Dec 30, 2023
22 pages
How to Evaluate Machine Learning Models
No ratings yet
How to Evaluate Machine Learning Models
14 pages
Symmetric Bilinear Form - Wikipedia, The Free Encyclopedia
No ratings yet
Symmetric Bilinear Form - Wikipedia, The Free Encyclopedia
3 pages
A Curse of Queens Amanda Bouchet Download
No ratings yet
A Curse of Queens Amanda Bouchet Download
100 pages

Mid 1 Answer

Uploaded by

Mid 1 Answer

Uploaded by

Q.

1 Define Machine Learning

●​ Machine → A device or system created by humans to perform tasks.​

●​ Learning → The process of acquiring knowledge, behaviors, skills, or

●​ Machine Learning → A computer system’s ability to learn from

Machine Learning is when computers learn from data (experience) to

✅ Applications of Machine Learning

📊 1. Banking and Finance

○​ Detects fraudulent transactions by identifying abnormal patterns.​

○​ Predicts customer risk during onboarding by analyzing past data.​

○​ Improves claims management by spotting unusual claims or

○​ Uses data from wearable devices to predict health issues in real

○​ Alerts doctors or users if a critical health condition is detected,

📦 Other Examples (Additional Applications):

●​ AI personal assistants (e.g., Google Assistant): Understands speech

●​ Spam filters: Detects and classifies unwanted emails.​

●​ Image recognition: Identifies objects, faces, and scenes in pictures.​

✅ Key Points to Remember

2.​ ML is everywhere → Finance, insurance, healthcare, transport,

3.​ ML helps in prediction, classification, risk management, and

4.​ Example → Fraud detection, health monitoring, self-driving cars.​

5.​ It uses algorithms like supervised, unsupervised, and reinforcement

Here’s a detailed, point-wise, easy-to-remember explanation based on the

●​ Who discovered it?​

✅ Important Terms in Bayes’ Theorem

●​ Example → Probability that a patient has a malignant tumor, based on

●​ The updated belief after observing the evidence.​

●​ It combines prior knowledge and the new evidence.​

●​ The total probability of observing the evidence, regardless of which

✅ Bayes’ Theorem Formula

●​ Posterior = (Likelihood × Prior) / Evidence​

Here’s another example of Bayes’ Theorem, explained in an

✅ Example – Email Spam Filtering

●​ 20% of all emails are spam → P(Spam) = 0.20​

●​ 80% of all emails are not spam → P(Not Spam) = 0.80​

●​ If an email is spam, there’s a 70% chance it contains the keyword “Buy

That is, find P(Spam | Buy now).

✅ Final Summary – Easy to Remember

● Likelihood → How well the data supports the hypothesis.

● Posterior → What you believe after seeing the data.

● Formula → Posterior = (Likelihood × Prior) / Evidence.

1. Supervised Learning

2. Unsupervised Learning

3. Reinforcement Learning

● It uses past data to predict or classify new data.

● Predicting house prices based on area, location, etc.

● Classifying emails as spam or non-spam.

● Forecasting demand or stock prices.

✅ Types within Supervised Learning:

● Better data → better prediction.

● It finds patterns or structures within the data.

● Credit card fraud detection → Identifying suspicious patterns.

● Recommendation systems → Finding similar products for users.

✅ Types within Unsupervised Learning:

● Helps in identifying hidden relationships.

● It receives rewards or penalties based on actions.

● Robots → Learn optimal paths or actions.

● Game-playing AI → Learns strategies by competing against itself.

● Optimizes performance over time.

✅ Comparison of Types of Machine Learning

Feature | Supervised | Unsupervised | Reinforcement
Type of data | Labeled data | Unlabeled data | Interaction data
Output | Prediction/classification | Patterns/clusters | Actions & rewards
Use case | Forecasting, | Grouping, discovering | Navigation, gaming
Feedback | Supervised signal | No feedback | Reward/Penalty
Example | Spam detection | Customer | Self-driving cars

Q.4 Support Vector Machine (SVM) – Basics

● The decision boundary that separates different classes

● Data points closest to the hyperplane

➤ Maximum Margin Hyperplane (MMH)

● A wider margin helps the model generalize better and avoid

● Example → If two classes can be divided by a line in a 2D plot, Linear

2️⃣ Non-Linear SVM

● Used when data cannot be separated by a straight line

● Example → Complex datasets like circles, spirals, or overlapping clusters

✅ Algorithm Steps (Basic Idea)

3. Identify the support vectors closest to the boundary

4. For non-linear problems, apply kernel functions to map data to a

5. Optimize the hyperplane to reduce misclassification
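The idea of mapping data to a higher dimension can be illustrated with a small sketch. It is not a trained SVM: the toy 1-D data, the explicit map x → (x, x²), and the hand-placed boundary are all assumptions made for illustration. A real kernel SVM would not compute the map explicitly and would find the boundary by optimization.

```python
def feature_map(x):
    """Map a 1-D point into 2-D: x -> (x, x^2)."""
    return (x, x * x)


# Hypothetical 1-D data: class +1 lies between two groups of class -1,
# so no single threshold on x can separate the classes.
points = [-3.0, -2.5, -0.5, 0.0, 0.5, 2.5, 3.0]
labels = [-1, -1, 1, 1, 1, -1, -1]

mapped = [feature_map(x) for x in points]

# In the mapped space the classes differ in the second coordinate (x^2).
# The closest points of each class play the role of support vectors.
inner = max(m[1] for m, y in zip(mapped, labels) if y == 1)
outer = min(m[1] for m, y in zip(mapped, labels) if y == -1)

# Place the boundary midway between them so the margin on each side is equal.
threshold = (inner + outer) / 2.0
margin = (outer - inner) / 2.0


def predict(x):
    """Classify by which side of the boundary the mapped point falls on."""
    return 1 if feature_map(x)[1] < threshold else -1


predictions = [predict(x) for x in points]
print(threshold, margin, predictions == labels)
```

After the mapping, one straight line (x² = 3.25) separates the classes, which no single cut on the original x axis could do.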

3. Select neighbors → Pick the k closest training points

4. Vote or average →

○ Classification → Take the majority class among neighbors

○ Regression → Take the average of the neighbors’ values
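The select-and-vote steps above can be sketched in plain Python. The function names and the toy data are illustrative assumptions, and Euclidean distance is assumed as the distance measure (`math.dist` needs Python 3.8 or newer).

```python
import math
from collections import Counter


def k_nearest(train, query, k):
    """Return the k training items (features, target) closest to query."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]


def knn_classify(train, query, k):
    """Classification: majority class among the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k):
    """Regression: average target value of the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    return sum(value for _, value in neighbors) / len(neighbors)


# Hypothetical training data.
class_data = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
              ((6.0, 6.0), "B"), ((7.0, 7.0), "B"), ((6.5, 5.5), "B")]
reg_data = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]

print(knn_classify(class_data, (2.0, 2.0), k=3))  # A
print(knn_regress(reg_data, (2.5,), k=2))         # 25.0
```

An odd k is a common choice for two-class problems because it avoids tied votes.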

2. Multiple Linear Regression

3. Polynomial Regression

4. Ridge Regression

5. Lasso Regression

6. Logistic Regression (though technically classification)

7. Support Vector Regression (SVR)

○ Builds a tree-like structure (dendrogram) of nested clusters.

○ Clusters are formed based on distance matrix instead of specifying

■ Start: Each data point = its own cluster.

■ At each step: Merge the two most similar clusters.

■ Stop: When all objects merge into one big cluster.

■ Example: AGNES (Agglomerative Nesting).

■ Start: All data in one cluster.

■ At each step: Split the most heterogeneous cluster.

■ Stop: When each object is a separate cluster.

3. Distance Measures Between Clusters

○ Single Link → Minimum distance between two points of different clusters.

○ Complete Link → Maximum distance between two points of different clusters.

○ Average Link → Average distance between points across clusters.

○ Centroid → Distance between centroids.

○ Medoid → Distance between most central points (medoids).
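The five measures can be written as small Python functions. This is a minimal sketch assuming Euclidean distance and clusters stored as lists of coordinate tuples; the function names and sample clusters are illustrative.

```python
import math
from itertools import product


def pairwise(a, b):
    """All distances between a point of cluster a and a point of cluster b."""
    return [math.dist(p, q) for p, q in product(a, b)]


def single_link(a, b):
    return min(pairwise(a, b))        # closest pair


def complete_link(a, b):
    return max(pairwise(a, b))        # farthest pair


def average_link(a, b):
    d = pairwise(a, b)
    return sum(d) / len(d)            # mean over all cross-cluster pairs


def centroid(cluster):
    """Coordinate-wise mean of the cluster."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def medoid(cluster):
    """Actual cluster member with the smallest total distance to the rest."""
    return min(cluster, key=lambda p: sum(math.dist(p, q) for q in cluster))


def centroid_link(a, b):
    return math.dist(centroid(a), centroid(b))


def medoid_link(a, b):
    return math.dist(medoid(a), medoid(b))


# Hypothetical clusters.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 2.0)]
print(single_link(a, b), complete_link(a, b), centroid_link(a, b))
```

Single link tends to produce long, chained clusters, while complete link favors compact ones, so the choice of measure changes the shape of the dendrogram.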

○ A tree diagram showing how clusters are merged/split.

○ By “cutting” at a desired level → final clusters are obtained.
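A bottom-up (agglomerative) merge loop can be sketched as follows. Stopping at a chosen number of clusters has the same effect as cutting the dendrogram at that level. The sketch assumes single-link Euclidean distance; the function names and sample points are illustrative, and the brute-force search is meant for small data only.

```python
import math


def single_link(a, b):
    """Minimum distance between any point of a and any point of b."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerative(points, num_clusters):
    """Merge the two closest clusters until num_clusters remain."""
    clusters = [[p] for p in points]      # start: each point is its own cluster
    target = max(1, num_clusters)
    while len(clusters) > target:
        best = None                       # (distance, i, j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_link(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return clusters


# Hypothetical 2-D points.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(agglomerative(points, 3))
# [[(0, 0), (0, 1)], [(5, 5), (5, 6)], [(10, 0)]]
```

Letting the loop run down to one cluster and recording each merge distance would give the full merge history that a dendrogram draws.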

○ Easy to understand and interpret.