AIML Question Ans Part1

Name: Khan Adil Parvez

Enrollment No: A70466225003
Branch: CSE
Batch: Jan-2025

Q.1: Explain Linear Regression along with an example.


Ans: Linear regression is used to model the relationship between two variables
by fitting a linear equation to observed data. There are two types of variables:
the independent variable (predictor) and the dependent variable (response).
Linear regression is commonly used for predictive analysis. The main idea of
regression is to examine two things: first, does a set of predictor variables do
a good job of predicting an outcome (dependent) variable? Second, which
variables are significant predictors of the outcome variable?

Linear Regression Example


Example 1: Linear regression can predict house prices based on size.
For example, if the formula is:
Price = 50,000 + 100 × Size (sq. ft),
a 2,000 sq. ft. house would cost:
Price = 50,000 + 100 × 2,000 = 250,000.
It helps find relationships and make predictions.
Example 2: Linear regression can predict sales based on advertising spend. For
example, if the formula is:
Sales = 5,000 + 20 × Ad Spend (in $1,000s),
and a company spends $50,000 on ads (Ad Spend = 50):

Sales = 5,000 + 20 × 50 = 6,000.
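
To make the house-price example concrete, here is a minimal sketch using scikit-learn. The sizes and prices below are invented points generated from the stated formula, so the fitted intercept and slope should come out close to 50,000 and 100.

```python
# Illustrative sketch (invented data): fitting the Price = 50,000 + 100 x Size line
import numpy as np
from sklearn.linear_model import LinearRegression

sizes = np.array([[1000], [1500], [2000], [2500], [3000]])   # sq. ft (feature)
prices = 50_000 + 100 * sizes.ravel()                         # generated from the formula above

model = LinearRegression().fit(sizes, prices)
print(model.intercept_, model.coef_[0])     # approximately 50,000 and 100
print(model.predict([[2000]])[0])           # approximately 250,000 for a 2,000 sq. ft. house
```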

Q.2: What are the assumptions of linear regression?


Ans:
1. Linearity:

● Assumption: There is a linear relationship between the independent
variables (predictors) and the dependent variable (response).
● Explanation: This means that the change in the dependent variable should
be proportional to the change in the independent variable(s). If the
relationship is nonlinear, linear regression may not be the best model, and
alternative methods like polynomial regression might be needed.

2. Independence of Errors:

● Assumption: The residuals (errors) are independent of each other.


● Explanation: The residuals are the differences between the observed and
predicted values. For the model to be reliable, the residuals should not be
correlated with each other. This assumption is particularly important
when dealing with time series data, where autocorrelation of errors can
occur.

3. Homoscedasticity:

● Assumption: The variance of the residuals (errors) is constant across all
levels of the independent variables.
● Explanation: Homoscedasticity means that the spread (or variability) of
the residuals should be the same for all predicted values of Y. If the
variance of residuals changes as the predicted values change
(heteroscedasticity), it can indicate problems with the model and affect
the reliability of statistical tests.

4. Normality of Errors:

● Assumption: The residuals (errors) are normally distributed.

● Explanation: For the purposes of hypothesis testing (like t-tests for the
regression coefficients) and calculating confidence intervals, the residuals
should follow a normal distribution. While this assumption is important
for inference, linear regression can still provide unbiased predictions even
if this assumption is somewhat violated, though statistical significance
might be affected.

5. No Multicollinearity (for multiple linear regression):


● Assumption: The independent variables are not highly correlated with
each other.
● Explanation: In multiple linear regression, multicollinearity occurs when
two or more independent variables are highly correlated with each other.
This can make it difficult to isolate the effect of each individual predictor
on the dependent variable, leading to unreliable estimates of the
coefficients.

6. No Measurement Error in Independent Variables:

● Assumption: The independent variables are measured accurately with no
error.
● Explanation: Measurement error in the independent variables can cause
biased regression coefficients, leading to incorrect conclusions. In
practice, it’s challenging to ensure no error in measurement, but
minimizing this error is important for obtaining valid results.
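
As a rough illustration of how assumption 5 (no multicollinearity) can be checked in practice, the sketch below computes variance inflation factors (VIF) with statsmodels on invented data; it is only an assumption-checking aid, not part of the original answer.

```python
# Hedged sketch (invented data): variance inflation factors for multicollinearity
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=100)        # deliberately correlated with x1
X = sm.add_constant(np.column_stack([x1, x2]))

# A VIF well above roughly 5-10 for a predictor suggests problematic multicollinearity
for i in range(1, X.shape[1]):
    print(f"VIF for predictor {i}: {variance_inflation_factor(X, i):.1f}")
```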

Q.3: Explain Logistic Regression with an example.


Ans:
Logistic Regression is a statistical method used to predict the probability of a
binary outcome (yes/no, 0/1) based on one or more independent variables. The
outcome is modeled using a sigmoid function so that the predicted probability
falls between 0 and 1. Essentially, it helps determine the likelihood of a
specific event occurring given certain input factors.

Example:
● Predicting whether a customer will purchase a product online:
● Independent variables: Customer's age, income level, time spent
browsing the website, number of items added to cart.
● Dependent variable: Whether the customer purchases the product
(yes/no).
● How it works: The logistic regression model analyzes past
customer data to identify patterns between these variables and the
purchase decision, then calculates the probability of a new
customer making a purchase based on their individual
characteristics.
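
A minimal sketch of this customer-purchase example with scikit-learn is shown below; the feature values (age, income, browsing minutes, items in cart) and labels are invented purely for illustration.

```python
# Hedged sketch (invented customer data): probability of purchase via logistic regression
import numpy as np
from sklearn.linear_model import LogisticRegression

# columns: age, income (in $1,000s), minutes browsing, items added to cart
X = np.array([[25, 40,  5, 0],
              [34, 72, 22, 3],
              [41, 55,  2, 0],
              [29, 90, 35, 5],
              [52, 60, 15, 2],
              [23, 30,  1, 0]])
y = np.array([0, 1, 0, 1, 1, 0])            # 1 = purchased, 0 = did not purchase

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Predicted probability of purchase for a new customer
new_customer = [[30, 65, 18, 2]]
print(clf.predict_proba(new_customer)[0, 1])
```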

Q.4: What are the assumptions of Logistic Regression?


Ans:
Logistic regression does not make many of the key assumptions of linear
regression and general linear models that are based on ordinary least squares
algorithms – particularly regarding linearity, normality, homoscedasticity, and
measurement level.
First, logistic regression does not require a linear relationship between the
dependent and independent variables. Second, the error terms (residuals) do not
need to follow a normal distribution. Third, you do not require
homoscedasticity. Finally, logistic regression does not require you to measure
the dependent variable on an interval or ratio scale.

However, some other assumptions still apply.

First, binary logistic regression requires the dependent variable to be binary and
ordinal logistic regression requires the dependent variable to be ordinal.

Second, logistic regression requires the observations to be independent of each
other. In other words, the observations should not come from repeated
measurements or matched data.

Third, logistic regression requires there to be little or no multicollinearity
among the independent variables. This means that the independent variables
should not be too highly correlated with each other.

Fourth, logistic regression assumes linearity of independent variables and log
odds of the dependent variable. Although this analysis does not require the
dependent and independent variables to be related linearly, it requires that the
independent variables are linearly related to the log odds of the dependent
variable.

Finally, logistic regression typically requires a large sample size. A general
guideline is that you need a minimum of 10 cases with the least frequent
outcome for each independent variable in your model.

Q.5: Enlist and explain performance metrics of Regression.
Ans:

In regression analysis, various performance metrics are used to evaluate how
well the model predicts the continuous target variable. Here's a list of key
performance metrics for regression, along with an explanation of each:

1. Mean Absolute Error (MAE)

● Definition: MAE is the average of the absolute differences between the
actual and predicted values.
● Formula:

MAE = (1/n) Σ |yi − ŷi|

where yi is the actual value, ŷi is the predicted value, and n is the
number of observations.
● Interpretation: MAE gives a linear score, meaning that all errors are
weighted equally. It provides a simple interpretation of how far off, on
average, the predictions are from the true values.

2. Mean Squared Error (MSE)

● Definition: MSE is the average of the squared differences between the
actual and predicted values. It penalizes larger errors more than MAE.
● Formula:

MSE = (1/n) Σ (yi − ŷi)²

● Interpretation: Since MSE squares the errors, larger deviations from the
true values have a disproportionately large effect on the metric. This
makes MSE more sensitive to outliers than MAE.

3. Root Mean Squared Error (RMSE)

● Definition: RMSE is the square root of the MSE. It returns the error in the
same units as the target variable.
● Formula:

RMSE = √MSE = √[(1/n) Σ (yi − ŷi)²]

● Interpretation: RMSE is useful for measuring how spread out the
residuals are. It is more sensitive to larger errors than MAE, similar to
MSE, and is interpreted in the same units as the target variable.
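
A small hedged sketch of computing these three error metrics with scikit-learn; the actual and predicted arrays are invented for illustration.

```python
# Sketch (invented values): MAE, MSE and RMSE for a set of predictions
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                        # RMSE is the square root of MSE
print(mae, mse, rmse)
```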

4. R-squared (R²)

● Definition: R-squared measures the proportion of the variance in the
target variable that is explained by the regression model.
● Formula:

R² = 1 − Σ (yi − ŷi)² / Σ (yi − ȳ)²

where ȳ is the mean of the actual values.

● Interpretation: R² ranges from 0 to 1, with a higher value indicating a
better fit. An R² of 1 means the model perfectly explains the variance in
the target variable, while an R² of 0 means the model explains none of the
variance.

5. Adjusted R-squared

● Definition: Adjusted R-squared is a modified version of R-squared that
accounts for the number of predictors in the model. It is used to compare
models with a different number of predictors.
● Formula:

Adjusted R² = 1 − (1 − R²) × (n − 1) / (n − p − 1)

where n is the number of data points and p is the number of predictors.

● Interpretation: Unlike R-squared, the adjusted R-squared will decrease if
irrelevant predictors are added to the model, making it a better measure
for comparing models with different numbers of predictors.
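
A hedged sketch of computing R² with scikit-learn and adjusted R² directly from the formula above, using invented predictions and an assumed predictor count.

```python
# Sketch (invented values): R-squared and adjusted R-squared
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 5.0, 2.5, 7.0, 4.0, 6.5])
y_pred = np.array([2.8, 5.1, 3.0, 6.8, 4.2, 6.0])

n, p = len(y_true), 2                      # assume the model used 2 predictors
r2 = r2_score(y_true, y_pred)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(r2, adj_r2)
```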

6. Mean Absolute Percentage Error (MAPE)

● Definition: MAPE measures the percentage difference between actual and
predicted values. It's useful for comparing models across different
datasets with different scales.
● Formula:

MAPE = (100% / n) Σ |(yi − ŷi) / yi|

● Interpretation: MAPE provides an intuitive percentage-based error, which
is easy to interpret. However, it is sensitive to small actual values and
may become undefined when actual values are zero.
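
A brief sketch with scikit-learn's mean_absolute_percentage_error (available in recent versions); note that it returns a fraction rather than a percentage, so it is multiplied by 100 here. The values are invented.

```python
# Sketch (invented values): MAPE, expressed as a percentage
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

y_true = np.array([100.0, 200.0, 150.0])
y_pred = np.array([110.0, 190.0, 160.0])
print(100 * mean_absolute_percentage_error(y_true, y_pred))   # percent error
```
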
7. Explained Variance Score

● Definition: This metric measures how much of the variance in the
dependent variable is explained by the model.
● Formula:

Explained Variance = 1 − Var(y − ŷ) / Var(y)

● Interpretation: A higher explained variance indicates that the model
explains more of the variation in the target variable. A score of 1 means
perfect prediction, and a score of 0 means the model explains none of the
variance.
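
A brief hedged sketch using scikit-learn's explained_variance_score on invented values.

```python
# Sketch (invented values): explained variance score
import numpy as np
from sklearn.metrics import explained_variance_score

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.1, 3.0, 6.8])
print(explained_variance_score(y_true, y_pred))
```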

8. F-statistic (ANOVA F-test)

● Definition: The F-statistic tests the overall significance of the regression
model. It checks whether the model is a good fit for the data.
● Interpretation: A higher F-statistic suggests that the model explains a
significant amount of variability in the target variable compared to the
residuals (unexplained variance).
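
As a rough illustration on invented data, the overall F-statistic and its p-value can be read off a fitted statsmodels OLS model.

```python
# Sketch (invented data): overall F-test from a statsmodels OLS fit
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = sm.add_constant(rng.normal(size=(50, 2)))
y = X @ np.array([1.0, 2.0, -1.5]) + rng.normal(size=50)

fit = sm.OLS(y, X).fit()
print(fit.fvalue, fit.f_pvalue)            # F-statistic and its p-value
```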

9. Heteroscedasticity

● Definition: Heteroscedasticity refers to the situation where the variance of
the errors is not constant across all levels of the independent variable(s),
violating the homoscedasticity assumption.
● Test: Common tests like the Breusch-Pagan test or White's test are used to
detect heteroscedasticity.
● Interpretation: If heteroscedasticity is present, it means that the model's
error variance is not constant, which can lead to inefficient estimates of
model parameters. It's important to address this for accurate predictions
and inference.
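
A hedged sketch of the Breusch-Pagan test mentioned above, using statsmodels on invented data whose error variance grows with the predictor; a small p-value suggests heteroscedasticity.

```python
# Sketch (invented data): Breusch-Pagan test for heteroscedasticity
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=100)
y = 2 + 3 * x + rng.normal(scale=x, size=100)    # error variance grows with x

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(fit.resid, X)
print(lm_pvalue)                                  # small p-value -> heteroscedasticity
```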

10. Residual Plots

● Definition: Residual plots are graphs that show the difference between the
actual and predicted values (residuals) against fitted values or predictor
values.
● Interpretation: In a well-fitted regression model, residuals should appear
randomly scattered with no obvious patterns. If there are patterns, this
may indicate that the model is not capturing important trends in the data.
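
A minimal sketch of such a plot with matplotlib on invented data: a random scatter of residuals around zero is what a well-fitted model should show.

```python
# Sketch (invented data): residuals-vs-fitted plot for a simple OLS model
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=100)
y = 1 + 2 * x + rng.normal(size=100)

fit = sm.OLS(y, sm.add_constant(x)).fit()

plt.scatter(fit.fittedvalues, fit.resid)          # residuals against fitted values
plt.axhline(0, color="grey", linewidth=1)
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```
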
Q.6: Enlist and explain performance metrics of Classifier.
Ans:

1. Accuracy

● Definition: Accuracy is the proportion of correctly predicted instances
(both true positives and true negatives) out of all instances.
● Formula:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where:
○ TP = True Positives
○ TN = True Negatives
○ FP = False Positives
○ FN = False Negatives
● Interpretation: Accuracy is a straightforward metric that is most useful
when the classes are balanced. However, it may be misleading in
imbalanced datasets, as it can be high even if the model performs poorly
on the minority class.

2. Precision

● Definition: Precision (also called Positive Predictive Value) measures the
proportion of positive predictions that are actually correct.
● Formula:

Precision = TP / (TP + FP)

● Interpretation: Precision tells us how many of the predicted positive
instances were truly positive. It is particularly useful when the cost of
false positives is high (e.g., spam detection, fraud detection).

3. Recall (Sensitivity or True Positive Rate)

● Definition: Recall (or Sensitivity) measures the proportion of actual
positive instances that are correctly identified by the model.
● Formula:

Recall = TP / (TP + FN)

● Interpretation: Recall is important when the cost of false negatives is
high (e.g., in medical diagnostics where missing a positive case could be
dangerous). A high recall means that the model captures most of the
actual positive instances.

4. F1-Score

● Definition: The F1-score is the harmonic mean of precision and recall,
providing a balance between the two metrics.
● Formula:

F1 = 2 × (Precision × Recall) / (Precision + Recall)

● Interpretation: F1-score is useful when you need a balance between
precision and recall, particularly in imbalanced datasets where both false
positives and false negatives are costly. It ranges from 0 (worst) to 1
(best).

5. Specificity (True Negative Rate)

● Definition: Specificity measures the proportion of actual negative
instances that are correctly identified by the model.
● Formula:

Specificity = TN / (TN + FP)

● Interpretation: Specificity is useful in contexts where correctly
identifying the negative class is important. It complements recall by
focusing on how well the model avoids false positives.
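
A hedged sketch computing all five metrics above from a confusion matrix with scikit-learn; the true and predicted labels are invented, and specificity is derived manually since scikit-learn has no dedicated function for it.

```python
# Sketch (invented labels): accuracy, precision, recall, F1 and specificity
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print("Accuracy:   ", accuracy_score(y_true, y_pred))     # (TP + TN) / all instances
print("Precision:  ", precision_score(y_true, y_pred))    # TP / (TP + FP)
print("Recall:     ", recall_score(y_true, y_pred))       # TP / (TP + FN)
print("F1-score:   ", f1_score(y_true, y_pred))           # harmonic mean of precision and recall
print("Specificity:", tn / (tn + fp))                     # TN / (TN + FP)
```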
