Department Of Computer Science & Engineering (Data Science)
210651 : Course : MATHEMATICAL FOUNDATION FOR DATA SCIENCE-II
Course Outcomes:
On completion of the course, the learner will be able to:
CO1 Understand how to solve multivariable calculus problems.
CO2 Apply real analysis concepts to analyse the convergence and continuity of functions used in data science models, ensuring robustness and reliability of analytical results.
CO3 Implement linear regression models to analyse relationships between variables, interpret model coefficients, and make accurate predictions based on real-world data sets.
CO4 Utilize time series analysis techniques to identify patterns, trends, and anomalies in sequential data, and develop predictive models for time-dependent phenomena.
CO5 Apply machine learning algorithms to classify data, cluster similar data points, and extract meaningful insights from large and complex data sets in various domains.
CO6 Use optimization techniques to fine-tune machine learning models, minimize cost functions, and optimize model parameters for improved performance and efficiency in data analysis tasks.
Bloom’s Levels:
Level Name Description
1 Remembering Basic recall of facts & concepts.
2 Understanding Explanation of ideas or concepts.
3 Applying Use of information in new situations.
4 Analyzing Drawing connections among ideas.
5 Evaluating Justifying a decision or course of action.
6 Creating Producing new or original work.
QUESTION BANK
Q.No. Question CO BL Marks
Unit III: Linear Regression
1 Write a note on simple linear regression and multiple linear regression. CO3 3 5
2 Write the difference between Simple Linear Regression & Multiple
Linear Regression. CO3 3 5
3 Explain residual analysis and the residual plot with an example. CO3 3 5
4 What is the importance of the variance inflation factor in a multiple regression model? CO3 2 5
5 Define and explain Sum of Squared Error (SSE), Mean Squared Error (MSE)
and Mean Absolute Error (MAE) w.r.t. regression CO3 2 5
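As a small illustration of the three error measures named in question 5, the sketch below computes them with NumPy. The observed and predicted values are hypothetical, chosen only to keep the arithmetic easy to follow; they do not come from this question bank.

```python
import numpy as np

# Hypothetical observed and predicted values (illustration only)
y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.5, 8.0])

errors = y_true - y_pred                # 0.5, -0.5, -1.0

sse = float(np.sum(errors ** 2))        # Sum of Squared Errors
mse = sse / len(errors)                 # Mean Squared Error = SSE / n
mae = float(np.mean(np.abs(errors)))    # Mean Absolute Error

print(f"SSE={sse:.3f} MSE={mse:.3f} MAE={mae:.3f}")
# prints: SSE=1.500 MSE=0.500 MAE=0.667
```

Squaring makes SSE and MSE more sensitive to the single largest error than MAE is, which is usually the point a written answer should bring out.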
6 Give the applications of linear regression in data science with examples. CO3 3 5
7 Explain any three variable selection techniques in statistics. CO3 4 5
8 For the following dataset of hours studied and scores of students, find the residuals and draw a residual plot. Comment on what the plot indicates about the model fit. CO3 5 5
Hours studied 2 4 5 6 7 8 10
Score 52 58 64 68 78 90 98
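One possible way to check a hand computation for question 8 is sketched below. It assumes NumPy is available and that the model intended is the ordinary least-squares line of score on hours studied; the variable names are illustrative.

```python
import numpy as np

hours = np.array([2, 4, 5, 6, 7, 8, 10], dtype=float)
score = np.array([52, 58, 64, 68, 78, 90, 98], dtype=float)

# Least-squares line: score = slope * hours + intercept
slope, intercept = np.polyfit(hours, score, 1)

fitted = slope * hours + intercept
residuals = score - fitted          # observed minus fitted

print(f"slope={slope:.4f} intercept={intercept:.4f}")
for h, s, f, e in zip(hours, score, fitted, residuals):
    print(f"hours={h:4.1f} score={s:5.1f} fitted={f:6.2f} residual={e:6.2f}")
```

Plotting `residuals` against `hours` gives the residual plot. A patternless scatter around zero suggests the straight line is adequate, while a curve or funnel shape suggests it is not.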
9 Obtain the coefficient of correlation for the following table of values. CO3 5 5
x 6 2 10 4 8
y 9 11 5 8 7
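A minimal sketch for verifying the answer to question 9, assuming the Pearson (product-moment) coefficient is the one intended and that NumPy is available:

```python
import numpy as np

x = np.array([6, 2, 10, 4, 8], dtype=float)
y = np.array([9, 11, 5, 8, 7], dtype=float)

dx = x - x.mean()                   # deviations from the mean of x (mean = 6)
dy = y - y.mean()                   # deviations from the mean of y (mean = 8)

s_xy = float(np.sum(dx * dy))       # -26
s_xx = float(np.sum(dx ** 2))       # 40
s_yy = float(np.sum(dy ** 2))       # 20

r = s_xy / (s_xx * s_yy) ** 0.5     # Pearson correlation coefficient
print(f"r = {r:.4f}")               # prints: r = -0.9192
```

The result is close to -1, which indicates a strong negative linear relationship between x and y.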
10 Compute the regression lines for the following data. CO3 5 5
x 10 14 19 26 30
y 12 16 18 26 29
11 Compute the regression lines for the following data and estimate y for x = 5. CO3 5 5
x 6 2 10 4 8
y 9 11 5 8 7
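The two regression lines asked for in question 11 can be checked with a sketch like the one below. It assumes the usual textbook definitions, where the line of y on x has slope Sxy/Sxx and the line of x on y has slope Sxy/Syy, and both pass through the point of means. The helper names `y_on_x` and `x_on_y` are illustrative.

```python
import numpy as np

x = np.array([6, 2, 10, 4, 8], dtype=float)
y = np.array([9, 11, 5, 8, 7], dtype=float)

x_bar, y_bar = x.mean(), y.mean()               # 6 and 8
s_xy = float(np.sum((x - x_bar) * (y - y_bar)))
s_xx = float(np.sum((x - x_bar) ** 2))
s_yy = float(np.sum((y - y_bar) ** 2))

b_yx = s_xy / s_xx      # slope of the regression line of y on x
b_xy = s_xy / s_yy      # slope of the regression line of x on y

def y_on_x(x0):
    """Estimate y from x using the line of y on x."""
    return y_bar + b_yx * (x0 - x_bar)

def x_on_y(y0):
    """Estimate x from y using the line of x on y."""
    return x_bar + b_xy * (y0 - y_bar)

print(f"y on x: y = {y_bar - b_yx * x_bar:.2f} + ({b_yx:.2f}) x")
print(f"x on y: x = {x_bar - b_xy * y_bar:.2f} + ({b_xy:.2f}) y")
print(f"estimate of y at x = 5: {y_on_x(5.0):.2f}")
```

Only the line of y on x should be used to estimate y at x = 5; the line of x on y answers the reverse question.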
12 Fit the line y = mx + c using the least squares method in simple linear regression. CO3 5 5
x 0 1 2 3 4 5 6 7
y -5 -3 -1 1 3 5 7 19
13 Fit a parabola y = ax^2 + bx + c to the following data using multiple linear regression. CO3 5 5
x -3 -2 -1 0 1 2 3
y 12 14 1 2 7 15 30
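Question 13 treats the parabola as a multiple linear regression with two predictors, x^2 and x. A minimal sketch of that idea, assuming NumPy and using its general least-squares solver in place of hand-solved normal equations:

```python
import numpy as np

x = np.array([-3, -2, -1, 0, 1, 2, 3], dtype=float)
y = np.array([12, 14, 1, 2, 7, 15, 30], dtype=float)

# Design matrix with columns x^2, x and a constant term.
# The model y = a*x^2 + b*x + c is linear in the unknowns a, b, c.
X = np.column_stack([x ** 2, x, np.ones_like(x)])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b, c = coef

print(f"a={a:.4f} b={b:.4f} c={c:.4f}")
# prints: a=2.1190 b=2.2143 c=3.0952
```

Because the x values are symmetric about zero, the sums of x and x^3 vanish, so the normal equations separate and b can be found on its own when working by hand.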
14 Fit the line y = ax + b using the least squares method in simple linear regression. CO3 5 5
x 0 1 2 3 4 5 6 7
y 2 -3 4 -4 3 9 11 12
15 Fit a parabola y = ax^2 + bx + c to the following data using multiple linear regression. CO3 5 5
x -3 -2 -1 0 1 2 3
y 8 4 1 2 8 10 15
Unit IV: Time Series Analysis
1 Define time series analysis. Why is it important in data science? CO4 2 5
2 Explain the components of a time series. How do trend, seasonality, cyclic variations, and irregular variations impact time series forecasting? CO4 3 5
3 Define a time series and explain its types based on trends, patterns, and seasonality. CO4 2 5
4 Define a time series and explain its types based on the behavior of the data, the presence of noise, and the frequency of data collection. CO4 3 5
5 Explain stationarity and autocorrelation in time series with an example. CO4 3 5
6 Differentiate between stationary and non-stationary time series. Why is
stationarity important in time series modeling? CO4 2 5
7 Explain the concept of seasonality in time series analysis. Provide an
example where seasonal decomposition is necessary. CO4 3 5
8 Give a brief note on ARIMA model CO4 4 5
9 What is the Autoregressive Integrated Moving Average (ARIMA)
model? Describe its components and usage in time series forecasting. CO4 3 5
10 Explain the applications of time series in data science with the help of examples. CO4 3 5
11 Explain Forecasting Techniques in Time Series Analysis CO4 4 5
12 For the following sales data for the last 6 months, forecast the sales for July using ARIMA(1,1,1). CO4 5 5
Month Jan Feb March April May June
Sales 120 135 150 160 175 190
Unit V: Machine Learning Foundations
1 Explain the types of machine learning with examples CO5 2 5
2 Define Supervised Learning and explain its applications with suitable
examples CO5 3 5
3 Differentiate between Classification & regression problems in context
of supervised learning CO5 2 5
4 Explain any three clustering algorithms with examples. CO5 2 5
5 Explain the bias-variance tradeoff in machine learning with a suitable example. CO5 3 5
6 Explain the applications of machine learning in data science. CO5 3 5
7 What is the K-Nearest Neighbors (KNN) algorithm? Describe its
algorithmic steps with an illustration. CO5 3 5
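The algorithmic steps asked for in question 7 (measure distances, pick the k closest training points, take a majority vote) can be sketched as follows. This is a bare-bones illustration assuming Euclidean distance and NumPy; the function name `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 1: Euclidean distance from the query to every training point
    dists = np.linalg.norm(X_train - query, axis=1)
    # Step 2: indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Step 3: majority vote over the labels of those neighbours
    votes = Counter(str(y_train[i]) for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups labelled "A" and "B"
X_train = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
                    [6.0, 6.0], [7.0, 7.5], [6.5, 8.0]])
y_train = np.array(["A", "A", "A", "B", "B", "B"])

print(knn_predict(X_train, y_train, np.array([2.0, 2.0])))   # prints: A
print(knn_predict(X_train, y_train, np.array([6.0, 7.0])))   # prints: B
```

There is no training step beyond storing the data, which is why KNN is often described as a lazy learner; the choice of k and of the distance measure are the main decisions.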
8 Compare decision trees with other supervised learning models. What
are the advantages and limitations of decision trees? CO5 3 5
9 Explain the training and prediction phases of a supervised learning
model. How is model accuracy evaluated? CO5 5 5
10 Design a machine learning workflow using a classification problem.
Highlight each stage and the decisions involved. CO5 5 5
Unit VI: Optimization Techniques
1 Explain the types of optimization in machine learning & data science. CO6 3 5
2 Explain convex optimization and convex sets, and use these ideas to find the minimum of f(x) = x^2 + 3x + 4. CO6 5 5
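For the quadratic in question 2, the calculation can be written out as below. This is a sketch of the standard argument: a positive second derivative shows that f is convex, so the stationary point is the global minimum.

```latex
\begin{aligned}
f(x) &= x^2 + 3x + 4 \\
f'(x) &= 2x + 3, \qquad f''(x) = 2 > 0 \quad \text{(so $f$ is convex)} \\
f'(x^*) &= 0 \;\Rightarrow\; x^* = -\tfrac{3}{2} \\
f(x^*) &= \tfrac{9}{4} - \tfrac{9}{2} + 4 = \tfrac{7}{4}
\end{aligned}
```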
3 Explain Gradient Descent method with an example CO6 5 5
4 Using the gradient descent method, find the first five iterations for f(x) = x^2 + 4x + 4 for dataset (0,5) with a learning rate of 0.1. CO6 5 5
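A minimal sketch of the update rule x_new = x - learning_rate * f'(x) for the function in question 4. The question does not state the starting point unambiguously, so the start x0 = 0 used here is an assumption; change `x0` to match the intended reading. The helper name `gradient_descent` is illustrative.

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Return the iterates x0, x1, ..., x_steps of plain gradient descent."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - learning_rate * grad(x))
    return xs

def grad_f(x):
    # Derivative of f(x) = x^2 + 4x + 4
    return 2 * x + 4

# Assumed starting point x0 = 0 (not fixed by the question text)
xs = gradient_descent(grad_f, x0=0.0, learning_rate=0.1, steps=5)

for k, xk in enumerate(xs):
    print(f"iteration {k}: x = {xk:.5f}")
```

With this starting point the iterates are 0, -0.4, -0.72, -0.976, -1.1808, -1.34464, moving steadily toward the true minimizer x = -2.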
5 Explain constrained optimization techniques. CO6 3 5
6 Explain the concept of the Lagrange multiplier method. How is it useful in optimization problems with constraints? CO6 5 5
7 Find the minimum value of f(x, y) = x^2 + y^2 subject to the condition x + y = 1 using the method of Lagrange multipliers. CO6 5 5
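The working for question 7 can be laid out as follows. The sign convention for the multiplier (subtracting lambda times the constraint) is one common choice and is an assumption here; the opposite sign gives the same point and the same minimum.

```latex
\begin{aligned}
L(x, y, \lambda) &= x^2 + y^2 - \lambda\,(x + y - 1) \\
\frac{\partial L}{\partial x} &= 2x - \lambda = 0, \qquad
\frac{\partial L}{\partial y} = 2y - \lambda = 0
\;\Rightarrow\; x = y = \tfrac{\lambda}{2} \\
x + y &= 1 \;\Rightarrow\; \lambda = 1, \quad x = y = \tfrac{1}{2} \\
f\!\left(\tfrac{1}{2}, \tfrac{1}{2}\right) &= \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}
\end{aligned}
```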
8 Explain any five meta-heuristic optimization algorithms with examples. CO6 3 5
9 Write the applications of optimization techniques in data science. CO6 3 5
10 Justify the use of penalty methods in constrained optimization problems and compare them with barrier methods. CO6 4 5
11 Design a scenario in which multi-objective optimization is required in data science and explain how Pareto optimality is applied. CO6 4 5