Question Bank
Modules 3, 4, 5
1. What is regression? Explain different types of regression and also explain their importance.
2. What is linear regression? Give the assumptions made for linear regression.
3. What is the ordinary least squares (OLS) approach to linear regression?
4. A company collected data on the number of hours its employees spent on training and their
performance scores on a test. The data for 5 employees is given below:
Hours of Training (X)    Performance Score (Y)
2                        81
4                        93
6                        91
8                        97
10                       109
Tasks:
a. Fit a simple linear regression line of the form Y = a + bX using the least squares
method.
b. Predict the performance score for an employee who underwent 7 hours of training.
c. Calculate the coefficient of determination R² to evaluate the model's goodness of fit.
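A hand computation for Question 4 can be cross-checked with a short script. The following is a minimal sketch, assuming the usual closed-form least-squares estimates (slope = S_xy / S_xx, intercept = mean(Y) - slope * mean(X)); the function and variable names are illustrative and do not come from the question bank.

```python
# Training-hours data from Question 4.
hours = [2, 4, 6, 8, 10]
scores = [81, 93, 91, 97, 109]


def fit_simple_ols(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of cross-deviations and squared deviations about the means.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_mean - b * x_mean
    return a, b


def r_squared(xs, ys, a, b):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


a, b = fit_simple_ols(hours, scores)
prediction_7h = a + b * 7  # task (b): 7 hours of training
r2 = r_squared(hours, scores, a, b)

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"predicted score at 7 hours = {prediction_7h:.3f}")
print(f"R^2 = {r2:.4f}")
```

The script is meant only to confirm arithmetic; the question still expects the intermediate sums to be shown.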
5. Give different validation methods for regression.
6. What is locally weighted regression (LWR)? Explain the LWR method step by step.
7. Given the following dataset:
X Y
1 1.2
2 1.9
3 3.0
4 3.9
5 5.1
Perform locally weighted linear regression to predict the value of Y
when X = 3. Assume a linear hypothesis model, use the Gaussian kernel for the weights,
and set the bandwidth parameter τ = 1.0.
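The LWR computation in Question 7 can be sketched in a few lines. This is a minimal sketch, assuming the common Gaussian kernel form w_i = exp(-(x_i - x_q)² / (2τ²)) and a weighted least-squares fit of a straight line around the query point; the question does not fix the exact kernel formula, so treat that form as an assumption. The name `lwr_predict` is illustrative.

```python
import math

# Dataset from Question 7.
xs = [1, 2, 3, 4, 5]
ys = [1.2, 1.9, 3.0, 3.9, 5.1]


def lwr_predict(xs, ys, x_query, tau):
    """Predict Y at x_query with locally weighted linear regression."""
    # Step 1: Gaussian kernel weights (assumed form); points near the
    # query get weights close to 1, distant points get weights near 0.
    weights = [math.exp(-((x - x_query) ** 2) / (2 * tau ** 2)) for x in xs]

    # Step 2: weighted means of X and Y.
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum

    # Step 3: weighted least-squares slope and intercept.
    s_xy = sum(w * (x - x_bar) * (y - y_bar)
               for w, x, y in zip(weights, xs, ys))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    theta1 = s_xy / s_xx
    theta0 = y_bar - theta1 * x_bar

    # Step 4: evaluate the local line at the query point.
    return theta0 + theta1 * x_query


prediction = lwr_predict(xs, ys, x_query=3, tau=1.0)
print(f"LWR prediction at X = 3: {prediction:.4f}")
```

Because a new line is fitted for every query point, the function must be called again for each X of interest.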
8. A company collected data on employees to predict their monthly salary (Y) based on two
variables: Years of Experience (X₁) and Number of Projects Completed (X₂).
The data from 5 employees is as follows:
Employee    Years of Experience (X₁)    Projects Completed (X₂)    Salary (Y) in $1000
A           1                           1                          40
B           2                           3                          50
C           3                           2                          55
D           4                           4                          65
E           5                           5                          70
Fit a multiple linear regression model of the form
Y = β₀ + β₁X₁ + β₂X₂.
Predict the salary for an employee with 3 years of experience and 4 completed projects, and
interpret the coefficients β₀, β₁, and β₂ in the context of the problem.
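For Question 8, the coefficients can be checked by solving the normal equations (XᵀX)β = XᵀY. The sketch below assumes that approach and uses a small hand-written Gaussian elimination so that it needs no external libraries; the helper names `solve_linear_system` and `fit_multiple_ols` are illustrative.

```python
# Employee data from Question 8: (X1, X2) pairs and salary in $1000.
features = [(1, 1), (2, 3), (3, 2), (4, 4), (5, 5)]
salaries = [40, 50, 55, 65, 70]


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_multiple_ols(rows, ys):
    """Return [beta0, beta1, ...] from the normal equations."""
    design = [[1.0, *row] for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


beta0, beta1, beta2 = fit_multiple_ols(features, salaries)
predicted_salary = beta0 + beta1 * 3 + beta2 * 4

print(f"beta0 = {beta0:.4f}, beta1 = {beta1:.4f}, beta2 = {beta2:.4f}")
print(f"predicted salary (in $1000) = {predicted_salary:.4f}")
```

In the fitted model, β₁ is the expected change in salary per extra year of experience with the number of projects held fixed, and β₂ plays the same role for projects.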
9. What is logistic regression? Is it a classifier or a regressor? Justify your answer.
10. Explain logistic regression with an example.
11. What is a decision tree classifier? Give the general decision tree algorithm.
12. Compare the C4.5 and ID3 decision tree algorithms.
13. Consider the following dataset:
Experience (years)    Education      Performance Score    Promotion
1                     High School    4.0                  No
3                     Bachelors      6.0                  No
5                     Bachelors      7.0                  Yes
7                     Masters        8.0                  Yes
2                     High School    5.0                  No
4                     Masters        6.5                  Yes
6                     Bachelors      7.5                  Yes
3                     High School    4.5                  No
a. Using C4.5, build a decision tree that classifies whether a new employee will be
promoted, based on their experience, education, and performance score.
b. Can the ID3 algorithm be used to build a decision tree for the above dataset? Justify your answer.
c. Classify whether an employee with 4 years of experience, a Bachelors education, and a
performance score of 6.8 will be promoted.
14. What are the steps and assumptions of the Naïve Bayes algorithm? Explain clearly.
15. Differentiate between Bayesian learning and probabilistic learning.
16. How does Laplace smoothing help prevent zero probabilities?
17. Consider the following dataset:
Patient    Fever    Cough    Disease
A          Yes      Yes      Yes
B          Yes      No       Yes
C          No       Yes      No
D          No       No       No
What is the probability that a new patient with Fever = Yes and Cough = Yes has the disease?
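Questions 16 and 17 can be explored together with a small categorical Naïve Bayes sketch. It assumes the standard formulation (class prior times the product of per-feature conditional probabilities, then normalisation) and add-α Laplace smoothing of the form (count + α) / (class size + α · number of feature values). The function name `naive_bayes_posterior` and the `alpha` parameter are illustrative.

```python
# Patient data from Question 17.
patients = [
    {"Fever": "Yes", "Cough": "Yes", "Disease": "Yes"},
    {"Fever": "Yes", "Cough": "No", "Disease": "Yes"},
    {"Fever": "No", "Cough": "Yes", "Disease": "No"},
    {"Fever": "No", "Cough": "No", "Disease": "No"},
]


def naive_bayes_posterior(records, target, query, alpha=0.0):
    """Return {class: posterior} for the feature values in `query`.

    alpha = 0 gives plain relative frequencies; alpha = 1 gives
    Laplace (add-one) smoothing. Assumes at least one class ends up
    with a non-zero score, otherwise normalisation is undefined.
    """
    classes = sorted({r[target] for r in records})
    scores = {}
    for label in classes:
        rows = [r for r in records if r[target] == label]
        score = len(rows) / len(records)  # prior P(class)
        for feature, value in query.items():
            n_values = len({r[feature] for r in records})
            matches = sum(1 for r in rows if r[feature] == value)
            # Conditional P(feature = value | class), optionally smoothed.
            score *= (matches + alpha) / (len(rows) + alpha * n_values)
        scores[label] = score
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


new_patient = {"Fever": "Yes", "Cough": "Yes"}
unsmoothed = naive_bayes_posterior(patients, "Disease", new_patient)
smoothed = naive_bayes_posterior(patients, "Disease", new_patient, alpha=1.0)

print("without smoothing:", unsmoothed)
print("with Laplace smoothing:", smoothed)
```

Comparing the two results shows the effect asked about in Question 16: without smoothing, a single unseen feature value forces a class score to zero, while smoothing keeps every class possible.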
18. Consider the following dataset:
Day    Outlook     Temperature    Play Tennis
1      Sunny       Hot            No
2      Sunny       Cool           No
3      Overcast    Hot            Yes
4      Rainy       Cool           Yes
5      Sunny       Mild           Yes
Using Naïve Bayes classifier, tell whether tennis can be played or not if Outlook is Rainy and
Temperature is Mild.
19. You have a dataset of students with one feature: their study hours (a continuous value), and
whether they passed or failed the exam.
Student    Study Hours (X)    Result (Y)
A          1.0                Fail
B          2.0                Fail
C          1.5                Fail
D          4.0                Pass
E          5.5                Pass
F          6.0                Pass
a. Estimate the prior probabilities: P(Pass) and P(Fail)
b. For the continuous attribute (Study Hours), assume it follows a Gaussian (normal)
distribution. Compute the mean (μ) and standard deviation (σ) of study hours for each class
(Pass, Fail).
c. For a new student who studies 3.0 hours, compute:
P(Study Hours=3.0∣Pass)
P(Study Hours=3.0∣Fail)
d. Use Naïve Bayes to compute:
P(Pass∣X=3.0) and P(Fail∣X=3.0)
Based on your result, predict whether the student is more likely to Pass or Fail.
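The steps of Question 19 map directly onto a short Gaussian Naïve Bayes sketch. One assumption needs stating: the question does not say whether σ is the sample or the population standard deviation. The code below uses the sample version (`statistics.stdev`, divisor n - 1); switching to `statistics.pstdev` gives the population version and changes the numbers.

```python
import math
import statistics

# Study hours from Question 19, grouped by result.
study_hours = {"Fail": [1.0, 2.0, 1.5], "Pass": [4.0, 5.5, 6.0]}


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


# (a) Prior probabilities from class frequencies.
n_total = sum(len(v) for v in study_hours.values())
priors = {label: len(v) / n_total for label, v in study_hours.items()}

# (b) Per-class mean and sample standard deviation (assumption: n - 1).
params = {label: (statistics.mean(v), statistics.stdev(v))
          for label, v in study_hours.items()}

# (c) Class-conditional likelihoods for the new student.
x_new = 3.0
likelihoods = {label: gaussian_pdf(x_new, mu, sigma)
               for label, (mu, sigma) in params.items()}

# (d) Posteriors via Bayes' rule: prior * likelihood, then normalise.
joint = {label: priors[label] * likelihoods[label] for label in priors}
evidence = sum(joint.values())
posteriors = {label: joint[label] / evidence for label in joint}
predicted = max(posteriors, key=posteriors.get)

print("priors:", priors)
print("params (mu, sigma):", params)
print("likelihoods:", likelihoods)
print("posteriors:", posteriors)
print("prediction:", predicted)
```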
20. What is an artificial neural network (ANN)? Explain the basic perceptron algorithm.
21. What is the role of the activation function in an ANN? Explain different activation functions.
22. Explain the learning process in the multilayer perceptron (MLP) model.
23. What are feedforward neural networks?
24. Differentiate between clustering and classification.
25. What are the properties of distance metrics? Explain various distance metrics and their
importance.
26. Explain K-means clustering algorithm with a suitable example.
27. Explain density-based clustering and the DBSCAN algorithm.
28. Explain different grid-based clustering approaches.
29. Explain the EM (expectation-maximization) algorithm.
30. How would you evaluate clustering methods? Explain the evaluation measures.