UNIT 1: PYTHON PROGRAMMING-II
1–10: NumPy Basics and Arrays
1. Which of these is the conventional way to import NumPy?
A) import numpy
B) import numpy as np
C) import np as numpy
D) from numpy import *
2. What does NumPy primarily provide in Python?
A) Data visualization tools
B) Multi-dimensional arrays and high-speed operations
C) Natural Language Processing features
D) Web development framework
3. What is the main advantage of NumPy arrays over Python lists for
numerical operations?
A) They support string manipulation
B) They operate faster due to optimized memory and vectorized operations
C) They can hold multiple data types in the same array
D) They are immutable
4. What is the output shape of np.array([[1,2,3],[4,5,6]])?
A) (3,2)
B) (2,3)
C) (6,)
D) (2,2)
5. Which operation in NumPy applies element-wise?
A) Matrix inversion
B) Matrix multiplication only
C) Element-wise addition and multiplication using arrays
D) None of the above
6. Which method converts a Python list into a NumPy array?
A) np.list()
B) np.convert()
C) np.array()
D) np.make_array()
7. In NumPy, what is arr.dtype used for?
A) To get the number of dimensions
B) To know the data type of elements in the array
C) To reshape an array
D) To get array length
8. What does arr.shape return in NumPy?
A) The datatype
B) Size of the array
C) The dimensions of the array as a tuple
D) Total number of elements
9. What is the output of np.zeros((2,3))?
A) Array of ones with shape (2,3)
B) Array of zeros with shape (2,3)
C) Array of random numbers
D) Error
10. To get a flattened version of an array arr, which method is
appropriate?
A) arr.flatten()
B) arr.reshape()
C) arr.flat()
D) arr.expand()
11–20: Pandas Foundations
11. What is a Series in Pandas?
A) A 2-dimensional labeled data structure
B) A 1-dimensional labeled array
C) A function in Pandas
D) An unordered collection
12. What is a DataFrame in Pandas?
A) A 1-dimensional labeled array
B) A 2-dimensional labeled data structure with columns
C) A visualization object
D) A type of Python dictionary
13. How do you create a Pandas Series from a list named lst?
A) pd.Series(lst)
B) pd.DataFrame(lst)
C) pd.Array(lst)
D) pd.Series.create(lst)
14. To import a CSV file into a DataFrame named df?
A) pd.load_csv('file.csv')
B) pd.read_csv('file.csv')
C) df.read_csv('file.csv')
D) np.read('file.csv')
15. To export DataFrame df to a CSV named out.csv?
A) df.to_csv('out.csv')
B) pd.to_csv(df, 'out.csv')
C) df.save('out.csv')
D) df.write_csv()
16. What does df.head() do?
A) Shows last 5 rows
B) Shows first 5 rows
C) Returns column names
D) Displays summary statistics
17. To view DataFrame summary stats like mean and std dev?
A) df.summary()
B) df.info()
C) df.describe()
D) df.stats()
18. To rename a column from 'old' to 'new' in df?
A) df.rename(columns={'old':'new'}, inplace=True)
B) df.change('old','new')
C) df.columns = 'new'
D) df.rename_column(old=new)
19. Which method helps read only specific columns from a CSV?
A) usecols= parameter in read_csv
B) df.select_cols()
C) df.columns(['col1','col2'])
D) None of the above
20. Which call sets a specific column as the index while importing a CSV?
A) pd.read_csv('f.csv', index='col')
B) pd.read_csv('f.csv', index_col='col')
C) df.set_index('col')
D) None of the above
21–30: Handling Missing Data & DataFrame Operations
21. Which Pandas function identifies missing values?
A) df.missing()
B) df.isnull()
C) df.nan()
D) df.null()
22. To drop rows with missing values?
A) df.dropna()
B) df.remove_na()
C) df.na.drop()
D) df.clean()
23. To fill missing values with zero?
A) df.fillna(0)
B) df.na(0)
C) df.replace(None, 0)
D) df.zero()
24. How to fill missing values in column 'age' with the mean?
A) df['age'] = df['age'].fillna(df['age'].mean())
B) df.fillna('age', mean)
C) df['age'].mean_fill()
D) df.mean('age').fill()
25. How to drop a column named 'col'?
A) df.drop(columns=['col'])
B) df.remove('col')
C) df.del('col')
D) del df['col'] (also valid in pandas)
26. Add a new column 'new' calculated from 'col1' + 'col2'?
A) df['new'] = df['col1'] + df['col2']
B) df.new = df.col1 + df.col2
C) df.add_column('new', df['col1'] + df['col2'])
D) Only A is correct
27. To sort DataFrame df by a column 'score' descending?
A) df.sort('score', ascending=False)
B) df.sort_values(by='score', ascending=False)
C) df.order_by('score', desc=True)
D) df.sort_desc('score')
28. How to reset an index of DataFrame?
A) df.reset_index(drop=True)
B) df.reset()
C) df.set_index(None)
D) df.reindex()
29. Which method returns data types of each column?
A) df.info()
B) df.types()
C) df.dtypes
D) Both A and C work (df.dtypes shows column types; info() includes types)
30. Conversion of a column 'col' to integer type?
A) df['col'].astype(int)
B) df['col'].convert(int)
C) df.astype('col', int)
D) df.typecast('col', int)
31–40: Linear Regression (Advanced Learners)
31. What is the simplest form of regression in ML?
A) Logistic regression
B) Linear regression
C) Polynomial regression
D) None of these
32. In linear regression with 1 feature, the equation y = a x + b
denotes:
A) y is feature, x is target
B) x is predictor, y is response; a = slope, b = intercept
C) a is intercept, b is slope
D) None of these
33. Which Python library provides ready-made classes for linear regression?
A) NumPy
B) Pandas
C) Scikit-learn
D) Matplotlib
34. How to split data into training and testing sets?
A) Manual slicing
B) train_test_split from Scikit-learn
C) Pandas df.split()
D) NumPy np.split()
35. In Scikit-learn, which class is used for linear regression?
A) LinearModel
B) LinearRegression
C) Regression
D) LinearFit
36. After training a linear regression model named model, how do you
make predictions on X_test?
A) model.predict(X_test)
B) model.predict()
C) model.predict(X_train)
D) predict(model, X_test)
37. Which metric calculates the average squared error between actual
and predicted values?
A) Mean Squared Error (MSE)
B) Root Mean Squared Error (RMSE)
C) Both A and B
D) Mean Absolute Error (MAE)
38. Which metric gives the error in the same units as the target variable?
A) MSE
B) RMSE
C) Both
D) None
39. Which library provides functions for MSE and RMSE?
A) NumPy
B) Pandas
C) Scikit-learn (mean_squared_error)
D) Matplotlib
40. Why use a train-test split in regression?
A) To test model on unseen data and judge performance
B) To increase accuracy artificially
C) To reduce dataset size
D) It's optional and not helpful
ANSWER KEY:
1–10: NumPy Basics and Arrays
1. B) import numpy as np
2. B) Multi-dimensional arrays and high-speed operations
3. B) They operate faster due to optimized memory and vectorized
operations
4. B) (2,3)
5. C) Element-wise addition and multiplication using arrays
6. C) np.array()
7. B) To know the data type of elements in the array
8. C) The dimensions of the array as a tuple
9. B) Array of zeros with shape (2,3)
10. A) arr.flatten()
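The NumPy answers above can be checked with a short sketch. The array values are the ones used in Question 4; the variable names are chosen only for illustration.

```python
import numpy as np

# Build a 2-D array from a nested Python list (2 rows, 3 columns).
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape)   # dimensions as a tuple: (2, 3)
print(arr.dtype)   # data type of the elements (an integer type)

# Element-wise arithmetic: each element is combined with its partner.
doubled = arr * 2
summed = arr + arr

# np.zeros takes the shape as a tuple and fills the array with 0.0.
zeros = np.zeros((2, 3))

# flatten() returns a new 1-D copy of the array.
flat = arr.flatten()
print(flat)        # [1 2 3 4 5 6]
```

Note that arr.flat (Question 10, option C) is an iterator attribute, not a method, so arr.flat() raises an error.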
11–20: Pandas Foundations
11. B) A 1-dimensional labeled array
12. B) A 2-dimensional labeled data structure with columns
13. A) pd.Series(lst)
14. B) pd.read_csv('file.csv')
15. A) df.to_csv('out.csv')
16. B) Shows first 5 rows
17. C) df.describe()
18. A) df.rename(columns={'old':'new'}, inplace=True)
19. A) usecols= parameter in read_csv
20. B) pd.read_csv('f.csv', index_col='col')
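The Pandas answers above can be tried without a real file. This is a minimal sketch: the CSV text, the extra City column and the variable names are made up for illustration, and io.StringIO stands in for a file on disk.

```python
import io
import pandas as pd

# A small in-memory CSV stands in for a file on disk (hypothetical data).
csv_text = (
    "RollNo,Name,Marks,City\n"
    "101,Amit,87,Pune\n"
    "102,Riya,92,Delhi\n"
    "103,Kiran,78,Goa\n"
)

# usecols keeps only the listed columns; index_col makes RollNo the index.
df = pd.read_csv(io.StringIO(csv_text),
                 usecols=["RollNo", "Name", "Marks"],
                 index_col="RollNo")

print(df.head())       # first rows (up to 5)
print(df.describe())   # count, mean, std, min, quartiles, max of numeric columns

# rename returns a new DataFrame, so the result is assigned back.
df = df.rename(columns={"Marks": "Score"})

# A Series is a 1-D labelled array; here it is built from a plain list.
s = pd.Series([10, 20, 30])
```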
21–30: Handling Missing Data & DataFrame Operations
21. B) df.isnull()
22. A) df.dropna()
23. A) df.fillna(0)
24. A) df['age'] = df['age'].fillna(df['age'].mean())
25. A) df.drop(columns=['col']) (D also works but A is standard)
26. A) df['new'] = df['col1'] + df['col2']
27. B) df.sort_values(by='score', ascending=False)
28. A) df.reset_index(drop=True)
29. D) Both A and C work (df.dtypes shows column types; info()
includes types)
30. A) df['col'].astype(int)
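A minimal sketch of the missing-data and DataFrame operations in this block, assuming a small made-up table. Results are assigned back rather than relying on inplace=True, because most of these methods return a new object.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age and one missing score.
df = pd.DataFrame({
    "name": ["Amit", "Riya", "Kiran", "Neha"],
    "age": [16, np.nan, 17, 16],
    "score": [87.0, 92.0, np.nan, 85.0],
})

missing_per_column = df.isnull().sum()   # count of NaN values in each column
complete_rows = df.dropna()              # rows with no missing value

# Fill 'age' with its mean and 'score' with zero, assigning the results back.
df["age"] = df["age"].fillna(df["age"].mean())
df["score"] = df["score"].fillna(0)

# astype returns a converted copy, so assign it back to keep the change.
df["age"] = df["age"].astype(int)

# Sort by score (highest first) and rebuild a clean 0..n-1 index.
ranked = df.sort_values(by="score", ascending=False).reset_index(drop=True)
print(ranked)
```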
31–40: Linear Regression (Advanced Learners)
31. B) Linear regression
32. B) x is predictor, y is response; a = slope, b = intercept
33. C) Scikit-learn
34. B) train_test_split from Scikit-learn
35. B) LinearRegression
36. A) model.predict(X_test)
37. A) Mean Squared Error (MSE)
38. B) RMSE
39. C) Scikit-learn (mean_squared_error)
40. A) To test model on unseen data and judge performance
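The regression answers above fit together into one short workflow. The sketch below assumes scikit-learn is installed and uses synthetic data generated from y = 3x + 2 plus noise; the data and the 80/20 split are illustrative choices, not part of the question bank.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data following y = 3x + 2 with a little noise (made up for this sketch).
rng = np.random.default_rng(0)
X = np.arange(50, dtype=float).reshape(-1, 1)   # one feature, as a 2-D column
y = 3.0 * X.ravel() + 2.0 + rng.normal(0.0, 1.0, size=50)

# Hold back 20% of the rows so the model is judged on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)   # average squared error
rmse = np.sqrt(mse)                        # same units as the target

print("slope a:", model.coef_[0])
print("intercept b:", model.intercept_)
print("MSE:", mse, "RMSE:", rmse)
```

The fitted slope and intercept should land close to 3 and 2, which is the a and b of y = a x + b from Question 32.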
DESCRIPTIVE QUESTIONS AND ANSWERS:
1. Create DataFrame from List of Lists
import pandas as pd
data = [
    [101, "Amit", 87],
    [102, "Riya", 92],
    [103, "Kiran", 78]
]
df = pd.DataFrame(data, columns=["RollNo", "Name", "Marks"])
print("DataFrame created from list of lists:\n", df)
2. Create DataFrame from Dictionary
import pandas as pd
data = {
"RollNo": [101, 102, 103],
"Name": ["Amit", "Riya", "Kiran"],
"Marks": [87, 92, 78]
}
df = pd.DataFrame(data)
print("DataFrame created from dictionary:\n", df)
3. Access Rows and Columns
import pandas as pd
df = pd.DataFrame({
"RollNo": [101, 102, 103],
"Name": ["Amit", "Riya", "Kiran"],
"Marks": [87, 92, 78]
})
# Access column
print("Marks column:\n", df["Marks"])
# Access row by integer position (iloc[1] is the second row)
print("Second row:\n", df.iloc[1])
4. Boolean Indexing
import pandas as pd
df = pd.DataFrame({
"Name": ["Amit", "Riya", "Kiran", "Neha"],
"Marks": [87, 92, 78, 85]
})
# Students with Marks > 80
high_marks = df[df["Marks"] > 80]
print("Students with Marks > 80:\n", high_marks)
5. Filter Multiple Conditions
import pandas as pd
df = pd.DataFrame({
"Name": ["Amit", "Riya", "Kiran", "Neha"],
"Marks": [87, 92, 78, 85],
"Age": [16, 17, 16, 17]
})
# Students with Marks>80 AND Age=17
filtered = df[(df["Marks"] > 80) & (df["Age"] == 17)]
print("Filtered DataFrame:\n", filtered)
6. Add New Column Based on Calculation
import pandas as pd
df = pd.DataFrame({
"Name": ["Amit", "Riya", "Kiran"],
"Marks1": [87, 92, 78],
"Marks2": [90, 88, 80]
})
# Add total marks column
df["Total"] = df["Marks1"] + df["Marks2"]
print("DataFrame with Total Marks:\n", df)
7. Import CSV and Display
import pandas as pd
# Assuming 'student.csv' has columns RollNo, Name, Marks
df = pd.read_csv("student.csv")
print("CSV DataFrame:\n", df)
8. Export DataFrame to CSV
import pandas as pd
df = pd.DataFrame({
"Name": ["Amit", "Riya", "Kiran"],
"Marks": [87, 92, 78]
})
df.to_csv("output.csv", index=False)
print("DataFrame exported to 'output.csv'")
9. Access Specific Rows and Columns
import pandas as pd
df = pd.DataFrame({
"RollNo": [101, 102, 103, 104],
"Name": ["Amit", "Riya", "Kiran", "Neha"],
"Marks": [87, 92, 78, 85]
})
# Access first two rows
print("First two rows:\n", df.head(2))
# Access specific columns
print("Name and Marks columns:\n", df[["Name", "Marks"]])
10. Create DataFrame from List of Dictionaries
import pandas as pd
data = [
    {"RollNo": 101, "Name": "Amit", "Marks": 87},
    {"RollNo": 102, "Name": "Riya", "Marks": 92},
    {"RollNo": 103, "Name": "Kiran", "Marks": 78}
]
df = pd.DataFrame(data)
print("DataFrame from list of dictionaries:\n", df)
11. Append/Concatenate DataFrames
import pandas as pd
df1 = pd.DataFrame({
"Name": ["Amit", "Riya"],
"Marks": [87, 92]
})
df2 = pd.DataFrame({
"Name": ["Kiran", "Neha"],
"Marks": [78, 85]
})
# Append df2 to df1 (DataFrame.append was removed in pandas 2.0; use pd.concat)
df_appended = pd.concat([df1, df2], ignore_index=True)
print("Appended DataFrame:\n", df_appended)
12. Series Arithmetic Operations
import pandas as pd
s1 = pd.Series([10, 20, 30])
s2 = pd.Series([1, 2, 3])
print("Addition:", s1 + s2)
print("Subtraction:", s1 - s2)
print("Multiplication:", s1 * s2)
print("Division:", s1 / s2)
13. Access Rows Using iloc and loc
import pandas as pd
df = pd.DataFrame({
"RollNo": [101, 102, 103],
"Name": ["Amit", "Riya", "Kiran"],
"Marks": [87, 92, 78]
})
# Access row by position
print("First row using iloc:\n", df.iloc[0])
# Access row by index label (after setting RollNo as the index)
df.set_index("RollNo", inplace=True)
print("Row with RollNo 102 using loc:\n", df.loc[102])
14. Boolean Indexing with Multiple Conditions
import pandas as pd
df = pd.DataFrame({
"Name": ["Amit", "Riya", "Kiran", "Neha"],
"Marks": [87, 92, 78, 85],
"Age": [16, 17, 16, 17]
})
# Marks > 80 AND Age=17
result = df[(df["Marks"] > 80) & (df["Age"] == 17)]
print("Filtered DataFrame:\n", result)
15. Adding a Calculated Column
import pandas as pd
df = pd.DataFrame({
"Name": ["Amit", "Riya", "Kiran"],
"Marks1": [87, 92, 78],
"Marks2": [90, 88, 80]
})
# Average marks column
df["Average"] = (df["Marks1"] + df["Marks2"]) / 2
print("DataFrame with Average:\n", df)
16. Access Specific Rows & Columns
import pandas as pd
df = pd.DataFrame({
"RollNo": [101, 102, 103, 104],
"Name": ["Amit", "Riya", "Kiran", "Neha"],
"Marks": [87, 92, 78, 85]
})
# Access first 2 rows
print("First 2 rows:\n", df.head(2))
# Access Name and Marks columns
print("Name and Marks:\n", df[["Name", "Marks"]])
UNIT 2: DATA SCIENCE METHODOLOGY: AN ANALYTIC APPROACH TO
CAPSTONE PROJECT
1–10: Problem Definition & Analytic Approach
1. The first step in a Data Science project is:
A) Data Cleaning
B) Modeling
C) Problem Definition
D) Evaluation
2. A Data Science project is also called a:
A) Capstone Project
B) Programming task
C) Spreadsheet task
D) Web project
3. Which step comes after Problem Definition?
A) Analytic Approach
B) Deployment
C) Data Cleaning
D) Feedback
4. Data Science Methodology helps us to:
A) Play games
B) Follow a clear process to solve problems
C) Learn drawing
D) Make speeches
5. In the Analytic Approach stage, we decide:
A) Which game to play
B) Which method or model to use
C) Which teacher to ask
D) Which exam to write
6. Predicting marks of students is an example of:
A) Regression
B) Classification
C) Clustering
D) Random guessing
7. Identifying whether a mail is spam or not is an example of:
A) Regression
B) Classification
C) Clustering
D) Data cleaning
8. Grouping customers based on purchase habits is:
A) Classification
B) Regression
C) Clustering
D) Evaluation
9. A well-defined problem should be:
A) Clear and measurable
B) Vague
C) Random
D) Confusing
10. Success criteria means:
A) How we will measure success of our project
B) How much data we have
C) What coding language we use
D) What chart we draw
11–20: Data Requirements, Collection & Understanding
11. Data requirement depends on:
A) Problem statement
B) Favorite subject
C) Mobile app used
D) Random choice
12. Tables in databases are examples of:
A) Structured data
B) Unstructured data
C) Videos
D) Images
13. Social media posts are:
A) Structured data
B) Unstructured data
C) Tabular data
D) Numeric data
14. Collecting data by survey is called:
A) Primary data collection
B) Secondary data collection
C) Tertiary data collection
D) Optional
15. Using already published government census is:
A) Primary data
B) Secondary data
C) Experimental data
D) Random data
16. Checking whether data is complete and correct is:
A) Data Understanding
B) Data Cleaning
C) Deployment
D) Coding
17. Missing values, duplicates, and outliers are:
A) Data quality issues
B) Data visualizations
C) Problem statements
D) Success criteria
18. Calculating mean, median, and mode comes under:
A) Data Understanding
B) Deployment
C) Problem Definition
D) Feedback
19. Data visualization helps to:
A) Understand data better
B) Waste time
C) Confuse students
D) Avoid coding
20. Pie charts and bar graphs are used in:
A) Data Visualization
B) Model Training
C) Problem Definition
D) Deployment
21–30: Data Preparation & Modeling
21. Removing duplicates is part of:
A) Data Preparation
B) Problem Definition
C) Deployment
D) Feedback
22. Filling missing values is part of:
A) Data Preparation
B) Deployment
C) Problem Definition
D) Evaluation
23. Converting text labels into numbers is called:
A) Encoding
B) Cleaning
C) Plotting
D) Deployment
24. Splitting data into training and testing is done in:
A) Data Preparation
B) Deployment
C) Problem Definition
D) Feedback
25. The variable we want to predict is called:
A) Target
B) Feature
C) Input
D) Record
26. The input variables used to predict are called:
A) Features
B) Targets
C) Labels
D) Models
27. In supervised learning, we have:
A) Input and output labels
B) Only inputs
C) Only outputs
D) Random data
28. In clustering, data is grouped:
A) Without labels
B) With labels
C) With answers given
D) With teacher’s notes
29. Predicting rainfall in cm is an example of:
A) Regression
B) Classification
C) Clustering
D) Deployment
30. Predicting whether a student passes or fails is an example of:
A) Classification
B) Regression
C) Clustering
D) Data Cleaning
31–40: Evaluation, Deployment & Feedback
31. Accuracy is used for:
A) Classification models
B) Regression models
C) Clustering models
D) Data visualization
32. Mean Squared Error is used for:
A) Regression models
B) Classification models
C) Clustering models
D) None
33. Precision and Recall are for:
A) Classification problems
B) Regression problems
C) Data cleaning
D) Deployment
34. Cross-validation is used for:
A) Checking model performance
B) Collecting data
C) Drawing graphs
D) Defining problem
35. ROC curve is used in:
A) Classification models
B) Regression models
C) Clustering models
D) Data preparation
36. In deployment stage, the model is:
A) Used in real world
B) Deleted
C) Tested again
D) Ignored
37. Feedback stage helps in:
A) Improving the model
B) Stopping the project
C) Avoiding evaluation
D) Deleting data
38. Data Science project is considered complete when:
A) Model is deployed and working well
B) Data is collected
C) Problem is written
D) Graph is drawn
39. The last step in the Data Science Methodology is:
A) Feedback and Iteration
B) Modeling
C) Data Collection
D) Evaluation
40. A Capstone Project means:
A) A complete project applying all steps of Data Science Methodology
B) A single coding exercise
C) A survey form
D) A group discussion
ANSWER KEY:
1–10: Problem Definition & Analytic Approach
1. C) Problem Definition
2. A) Capstone Project
3. A) Analytic Approach
4. B) Follow a clear process to solve problems
5. B) Which method or model to use
6. A) Regression
7. B) Classification
8. C) Clustering
9. A) Clear and measurable
10. A) How we will measure success of our project
11–20: Data Requirements, Collection & Understanding
11. A) Problem statement
12. A) Structured data
13. B) Unstructured data
14. A) Primary data collection
15. B) Secondary data
16. A) Data Understanding
17. A) Data quality issues
18. A) Data Understanding
19. A) Understand data better
20. A) Data Visualization
21–30: Data Preparation & Modeling
21. A) Data Preparation
22. A) Data Preparation
23. A) Encoding
24. A) Data Preparation
25. A) Target
26. A) Features
27. A) Input and output labels
28. A) Without labels
29. A) Regression
30. A) Classification
31–40: Evaluation, Deployment & Feedback
31. A) Classification models
32. A) Regression models
33. A) Classification problems
34. A) Checking model performance
35. A) Classification models
36. A) Used in real world
37. A) Improving the model
38. A) Model is deployed and working well
39. A) Feedback and Iteration
40. A) A complete project applying all steps of Data Science
Methodology
DESCRIPTIVE QUESTIONS:
Q1. What are the main steps in the Data Science Methodology?
Explain briefly.
Answer:
Main Steps of the Data Science Methodology
Big picture: It’s a repeatable, iterative cycle that moves from a real-world
problem to a working solution, and then improves it with feedback.
1) Business Understanding (Problem Scoping)
Key question: What problem are we solving and why?
What we do: Talk to stakeholders, use 5W1H and Design Thinking, define
goals, scope, success criteria, constraints.
Deliverable: Clear problem statement + success measure.
Mini example: A school wants to identify students who may need extra
help before finals (goal: reduce failures by 10%).
2) Analytic Approach
Key question: How can data answer this problem?
What we do: Map the problem to an analysis type—classification,
regression, clustering, anomaly detection, recommendation; choose
descriptive/diagnostic/predictive/prescriptive analytics.
Deliverable: Chosen approach (e.g., classification).
Mini example: “Pass/Fail” is classification.
3) Data Requirements
Key question: What data do we need?
What we do: Decide data types (numbers/text/images), format
(tables/CSV), granularity, time range, quality, ethics/privacy.
Deliverable: Data requirement/specification document.
Mini example: Needed features: attendance %, internal test scores,
homework completion, study hours.
4) Data Collection
Key question: Where will we get it from?
What we do: Gather data from primary (surveys, forms, sensors) and
secondary (databases, open data) sources.
Deliverable: Raw dataset + data dictionary.
Mini example: Pull records from the school MIS; add a short student survey
for study hours.
5) Data Understanding (Exploration)
Key question: Does the data represent the problem correctly?
What we do: Explore with statistics and visuals; check distributions,
correlations, missing values, outliers, data leakage.
Deliverable: EDA (Exploratory Data Analysis) summary with findings/issues.
Mini example: Notice many missing study hours for Grade 12;
attendance strongly correlates with outcomes.
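A rough sketch of this exploration step with pandas, assuming a small invented table for the school example. The column names attendance_pct, study_hours and final_score are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical student records for the school example.
students = pd.DataFrame({
    "attendance_pct": [95, 80, 60, 88, 72, 55],
    "study_hours": [10.0, 7.0, np.nan, 9.0, np.nan, 3.0],
    "final_score": [91, 78, 52, 85, 66, 45],
})

summary = students.describe()                 # count, mean, std, quartiles
missing_counts = students.isnull().sum()      # missing values per column

# Pearson correlation between attendance and the outcome.
corr = students["attendance_pct"].corr(students["final_score"])

print(summary)
print(missing_counts)
print("attendance vs final score correlation:", round(corr, 3))
```

A correlation close to 1 here would support the finding that attendance is strongly related to outcomes, and the missing-value count flags the study-hours gap before modelling starts.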
6) Data Preparation (Cleaning & Feature Engineering)
Key question: How do we make data model-ready?
What we do:
Clean: fix typos, handle missing values, remove duplicates.
Transform: encode categories, scale numbers, derive new features.
Integrate: merge tables; split into train/validation/test.
Deliverable: Tidy, modelling-ready dataset. (Most time-consuming
step!)
Mini example: Create “term average”, “consistency score”;
impute missing study hours; one-hot encode streams (Sci/Comm/Arts).
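The cleaning and transformation tasks of Step 6 might look like this in pandas. This is only a sketch: the records, the column names and the choice of the median for imputation are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records; column names are illustrative only.
raw = pd.DataFrame({
    "student": ["Amit", "Riya", "Riya", "Kiran", "Neha"],
    "stream": ["Sci", "Comm", "Comm", "Arts", "Sci"],
    "test1": [80, 90, 90, 60, 70],
    "test2": [90, 94, 94, 70, 80],
    "study_hours": [8.0, np.nan, np.nan, 4.0, 6.0],
})

# Clean: drop exact duplicate rows.
clean = raw.drop_duplicates().reset_index(drop=True)

# Clean: impute missing study hours with the column median.
clean["study_hours"] = clean["study_hours"].fillna(clean["study_hours"].median())

# Transform: derive a "term average" feature from the two tests.
clean["term_average"] = (clean["test1"] + clean["test2"]) / 2

# Transform: one-hot encode the stream column (one column per category).
prepared = pd.get_dummies(clean, columns=["stream"])
print(prepared)
```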
7) Modelling
Key question: Which algorithm fits best?
What we do: Train candidate models (e.g., Decision Tree, Logistic
Regression), tune hyperparameters, use cross-validation.
Deliverable: One or more candidate models with training results.
Mini example: Train a Decision Tree to classify Pass/Fail.
8) Evaluation
Key question: Does the model answer the original question well?
What we do: Test on unseen data; use metrics (Accuracy, Precision, Recall,
F1 for classification; MAE/MSE/RMSE for regression). Check against business
success criteria (from Step 1).
Deliverable: Evaluation report + decision (proceed/adjust).
Mini example: Model gets F1 = 0.86; meets target (≥0.80). If not, revisit
features or approach.
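The classification metrics named in this step follow directly from four counts. The sketch below uses made-up labels (1 = student needs support) and plain Python, so the arithmetic behind Accuracy, Precision, Recall and F1 is visible.

```python
# Hypothetical labels: 1 = needs support (positive class), 0 = does not.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))

# Count the four outcomes of a binary classifier.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false alarms
fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # missed cases

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)      # of those flagged, how many were right
recall = tp / (tp + fn)         # of those who needed help, how many were found
f1 = 2 * precision * recall / (precision + recall)

# Compare with the success criterion set in Step 1 (target F1 of at least 0.80).
target_f1 = 0.80
meets_target = f1 >= target_f1
print(accuracy, precision, recall, f1, meets_target)
```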
9) Deployment
Key question: How does the user get the solution?
What we do: Put the model into use—web app, mobile app, dashboard, API;
decide batch vs real-time; add basic monitoring.
Deliverable: Working solution in the real environment.
Mini example: A simple teacher dashboard flags students needing
support each week.
10) Feedback (Monitoring & Improvement)
Key question: Is the problem solved? What can we improve?
What we do: Gather user feedback, track performance over time, detect
drift, retrain periodically, refine features and rules.
Deliverable: Updated model/process; continuous improvement loop.
Mini example: Teachers report some false alarms after vacations → adjust
the model to consider holiday weeks.
Q2. Explain the importance of Data Understanding and Data
Preparation in the Data Science Methodology with examples.
Answer:
Collected data cannot be used directly for analysis because raw data is
often incomplete, inconsistent, or contains errors. Two important steps at
this stage are Data Understanding and Data Preparation.
1. Data Understanding
o In this step, the data collected is carefully studied.
o The aim is to know what kind of data is available, its format, size,
and quality.
o Example: If a school collects student marks data, we check how
many subjects are there, whether marks are stored as numbers,
and if any values are missing.
o This step helps in deciding whether the data is sufficient for
solving the problem.
2. Data Preparation
o Also called data cleaning or data preprocessing.
o It involves correcting errors, filling missing values, removing
duplicates, and converting data into a useful format.
o Example: In the student marks data, if some records have
“Absent” instead of marks, it must be handled properly (e.g.,
replaced with 0 or marked as missing).
o Without preparation, the model may give wrong or misleading
results.
3. Why these steps are important
o Good quality data leads to reliable results.
o Poor or unclean data leads to wrong conclusions, which can
affect decision-making.
o These steps ensure that the data is accurate, consistent, and
suitable for building models.
👉 Thus, Data Understanding and Preparation are the backbone of the
Data Science process, because “garbage in, garbage out” – if we feed
wrong data, we will get wrong results.
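The "Absent" example from the answer can be sketched in pandas as follows. The table is invented, and marking absentees as missing (rather than 0) is just one of the two options mentioned above.

```python
import pandas as pd

# Hypothetical marks column where absentees were recorded as text.
marks = pd.DataFrame({
    "Name": ["Amit", "Riya", "Kiran", "Neha"],
    "Marks": ["87", "Absent", "78", "92"],
})

# Data Understanding: the column is stored as text, not numbers.
print(marks["Marks"].dtype)

# Data Preparation: convert to numbers; anything non-numeric becomes NaN.
marks["Marks"] = pd.to_numeric(marks["Marks"], errors="coerce")

absent_count = marks["Marks"].isnull().sum()
class_mean = marks["Marks"].mean()   # NaN values are skipped by default
print(absent_count, class_mean)
```

Treating "Absent" as missing keeps the class mean at about 85.7 for the three students who sat the test; replacing it with 0 would pull the mean down to 64.25, which shows why the handling choice matters.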
UNIT 3: MAKING MACHINES SEE
1–10: Basics of Computer Vision
1. What does "Computer Vision" mean?
A) Teaching machines to see and understand images
B) Teaching machines to write code
C) Teaching machines to listen to music
D) Teaching machines to play games
2. Which of the following is an example of Computer Vision?
A) Face recognition in mobile phones
B) Listening to songs
C) Reading novels
D) Sending emails
3. Which sense of humans does Computer Vision try to imitate?
A) Hearing
B) Vision
C) Touch
D) Smell
4. Which technology is used in self-driving cars to detect obstacles?
A) Computer Vision
B) Natural Language Processing
C) Cloud Computing
D) Virtual Reality
5. OCR stands for:
A) Optical Character Recognition
B) Object Character Recognition
C) Optical Camera Reading
D) Open Character Reading
6. Which device is commonly used to capture images for computer
vision?
A) Camera
B) Microphone
C) Keyboard
D) Speaker
7. In computer vision, an image is represented as:
A) Rows and columns of pixels
B) Blocks of sound waves
C) Pages of text
D) Series of commands
8. Each pixel in a grayscale image contains:
A) One intensity value
B) Three values (RGB)
C) Only 0 or 1
D) A text label
9. What does RGB stand for in images?
A) Red, Green, Blue
B) Read, Go, Black
C) Random, General, Binary
D) Range, Grid, Brightness
10. Which type of image uses only two colors: black and white?
A) Binary Image
B) Grayscale Image
C) RGB Image
D) Color Image
11–20: Image Types and Features
11. Which type of image stores a single intensity value from 0 to 255 for each pixel?
A) Grayscale
B) Binary
C) RGB
D) None
12. A color image (RGB) has how many channels?
A) 3
B) 2
C) 1
D) 4
13. Which of the following is an application of Computer Vision?
A) Detecting diseases in X-ray images
B) Listening to music
C) Writing stories
D) Sending SMS
14. Facial recognition system in airports is an example of:
A) Computer Vision
B) Speech Recognition
C) Data Science
D) Robotics only
15. Number plate detection in traffic cameras is called:
A) Automatic Number Plate Recognition (ANPR)
B) Automatic Name Plate Reading
C) Auto Numeric Processing
D) Artificial Numeric Prediction
16. Which of these is NOT a step in computer vision?
A) Capturing image
B) Processing image
C) Extracting features
D) Listening to songs
17. Edge detection is used to:
A) Find boundaries in images
B) Increase sound quality
C) Write texts faster
D) Store data
18. Which tool/library is widely used in Python for Computer Vision?
A) OpenCV
B) NumPy only
C) MS Word
D) PowerPoint
19. In image processing, filtering is used to:
A) Remove noise
B) Write documents
C) Send emails
D) Play music
20. Convolution is a process mainly used in:
A) Image processing and deep learning
B) Emailing
C) Gaming only
D) Text editing
21–30: Deep Learning & Vision
21. CNN stands for:
A) Convolutional Neural Network
B) Central Neural Node
C) Computer Numeric Network
D) Common Neural Net
22. CNN is widely used for:
A) Image recognition
B) Music composition
C) Text writing
D) Weather reporting
23. In CNN, the convolution layer is used to:
A) Detect features in an image
B) Play games
C) Send messages
D) Translate languages
24. Pooling layer in CNN is used for:
A) Reducing size of feature maps
B) Sending emails
C) Increasing volume
D) Writing code
25. Which of these is NOT an application of Computer Vision?
A) Object Detection
B) Speech-to-Text conversion
C) Face Detection
D) Medical Imaging
26. Object detection means:
A) Identifying and locating objects in an image
B) Counting numbers
C) Detecting sound
D) Writing notes
27. Image classification means:
A) Assigning a label to an image
B) Assigning sound to an image
C) Breaking the image into small pieces
D) Removing pixels
28. Which of these is an everyday use of Computer Vision?
A) Google Lens
B) MS Paint
C) Notepad
D) Excel only
29. Which industry uses computer vision for quality control?
A) Manufacturing
B) Music
C) Literature
D) Tourism
30. In healthcare, computer vision can help in:
A) Detecting tumors in scans
B) Cooking food
C) Playing sports
D) Teaching languages
31–40: Advanced Applications & Ethics
31. Which company uses Computer Vision in self-driving cars?
A) Tesla
B) WhatsApp
C) Spotify
D) Twitter
32. Which social media app uses computer vision for automatic
photo tagging?
A) Facebook
B) WhatsApp
C) Telegram
D) Signal
33. Gesture recognition allows computers to:
A) Understand hand and body movements
B) Understand smell
C) Understand music
D) Understand novels
34. Which of these is a challenge in computer vision?
A) Poor image quality
B) Bright lighting always
C) Correct grammar
D) High volume
35. CAPTCHA test on websites uses:
A) Computer Vision & Pattern Recognition
B) Music recognition
C) Text summarization
D) Story writing
36. Which of these is NOT a concern in Computer Vision?
A) Privacy issues
B) Data security
C) Image accuracy
D) Cooking recipes
37. Which application uses Computer Vision in sports?
A) Tracking ball movement in cricket/football
B) Reading books aloud
C) Generating poems
D) Playing songs
38. Augmented Reality (AR) uses Computer Vision for:
A) Overlaying digital images on real world
B) Sending text messages
C) Cooking recipes
D) Playing audio
39. Which of the following is an example of a biometric system using
Computer Vision?
A) Face unlock in smartphones
B) Fingerprint sensor
C) Voice recognition
D) Password typing
40. The main goal of Computer Vision is:
A) To enable machines to see, process, and understand images like
humans
B) To write emails faster
C) To play games
D) To talk like humans
Answer Key
1–10: Basics of Computer Vision
1. A) Teaching machines to see and understand images
2. A) Face recognition in mobile phones
3. B) Vision
4. A) Computer Vision
5. A) Optical Character Recognition
6. A) Camera
7. A) Rows and columns of pixels
8. A) One intensity value
9. A) Red, Green, Blue
10. A) Binary Image
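The image types in these answers can be pictured as NumPy arrays. This is a minimal sketch with made-up pixel values; the threshold of 127 used to produce the binary image is an arbitrary choice.

```python
import numpy as np

# A tiny 2x3 grayscale image: one intensity value (0-255) per pixel.
gray = np.array([[0, 128, 255],
                 [64, 200, 30]], dtype=np.uint8)

# A binary image keeps only two values; pixels above 127 become 1 (white).
binary = (gray > 127).astype(np.uint8)

# An RGB image adds a third axis with 3 channels: red, green, blue.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]   # top-left pixel is pure red

print(gray.shape)    # (2, 3): rows, columns
print(rgb.shape)     # (2, 3, 3): rows, columns, channels
print(binary)
```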
11–20: Image Types and Features
11. A) Grayscale
12. A) 3
13. A) Detecting diseases in X-ray images
14. A) Computer Vision
15. A) Automatic Number Plate Recognition (ANPR)
16. D) Listening to songs
17. A) Find boundaries in images
18. A) OpenCV
19. A) Remove noise
20. A) Image processing and deep learning
21–30: Deep Learning & Vision
21. A) Convolutional Neural Network
22. A) Image recognition
23. A) Detect features in an image
24. A) Reducing size of feature maps
25. B) Speech-to-Text conversion
26. A) Identifying and locating objects in an image
27. A) Assigning a label to an image
28. A) Google Lens
29. A) Manufacturing
30. A) Detecting tumors in scans
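To make answers 23 and 24 concrete, here is a small NumPy sketch of what a convolution layer and a pooling layer compute. The helper names convolve2d_valid and max_pool_2x2 are hypothetical, the kernel is hand-picked rather than learned, and a real CNN library would do this far more efficiently.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    As in most deep learning libraries, the kernel is not flipped,
    so strictly speaking this is cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]   # drop an odd last row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# 5x5 image: dark left part (0), bright right part (1).
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# Kernel that responds to a left-to-right increase in brightness.
kernel = np.array([[-1.0, 1.0]])

feature_map = convolve2d_valid(image, kernel)   # shape (5, 4)
pooled = max_pool_2x2(feature_map)              # shape (2, 2)
print(feature_map)
print(pooled)
```

The feature map is non-zero only where the brightness changes, and pooling shrinks it while keeping that strong response.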
31–40: Advanced Applications & Ethics
31. A) Tesla
32. A) Facebook
33. A) Understand hand and body movements
34. A) Poor image quality
35. A) Computer Vision & Pattern Recognition
36. D) Cooking recipes
37. A) Tracking ball movement in cricket/football
38. A) Overlaying digital images on real world
39. A) Face unlock in smartphones
40. A) To enable machines to see, process, and understand images
like humans
DESCRIPTIVE QUESTIONS AND ANSWERS:
Q1. Explain the five stages of the Computer Vision process with
examples.
Answer:
The Computer Vision (CV) process allows machines to “see” and interpret
visual data such as images and videos. It generally follows five main
stages:
1. Image Acquisition
First step: capturing or obtaining a digital image or video.
Sources: digital cameras, scanners, medical devices (like MRI, CT
scans), or design software.
Quality depends on camera resolution, lighting, and angle.
Example: A CCTV camera captures images of people entering a
building.
2. Preprocessing
Raw images may contain noise, distortions, or uneven brightness.
Preprocessing prepares them for analysis.
Common techniques:
o Noise Reduction: Removes unwanted spots or blur.
o Normalization: Adjusts pixel values to a fixed range.
o Resizing/Cropping: Makes all images the same size.
o Histogram Equalization: Improves brightness and contrast.
Example: Cleaning up a blurry passport photo for verification.
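Three of the preprocessing techniques above can be sketched with NumPy alone. The image is invented, and the 3x3 median filter (helper name median_filter_3x3 is hypothetical) is just one possible way to reduce noise; libraries such as OpenCV provide ready-made versions.

```python
import numpy as np

# Hypothetical 4x4 grayscale image with one bright noisy pixel (255).
image = np.array([[10, 10, 10, 10],
                  [10, 255, 10, 10],
                  [10, 10, 10, 10],
                  [10, 10, 10, 10]], dtype=np.uint8)

# Normalization: rescale pixel values from 0-255 to the range 0.0-1.0.
normalized = image.astype(np.float64) / 255.0

# Cropping: keep the top-left 2x2 region using array slicing.
cropped = image[:2, :2]

def median_filter_3x3(img):
    """Replace each interior pixel with the median of its 3x3 neighbourhood."""
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])
    return out

# Noise reduction: the isolated bright spot is replaced by its neighbours' value.
denoised = median_filter_3x3(image)
print(denoised)
```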
3. Feature Extraction
This step identifies important patterns or attributes from the
image.
Techniques include:
o Edge Detection: Finds boundaries of objects.
o Corner Detection: Finds points where edges meet.
o Texture Analysis: Recognizes smoothness, roughness, or
repeated patterns.
o Colour-based Features: Helps in distinguishing objects by
colour.
Example: A smartphone camera detecting a person’s face by locating
eyes, nose, and mouth edges.
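Edge detection can be illustrated with simple pixel differences. The sketch below is a simplified gradient-based detector on a made-up image, not a full method such as Sobel or Canny, and the threshold of 100 is an arbitrary assumption.

```python
import numpy as np

# 6x6 grayscale image: dark on the left, bright on the right.
image = np.zeros((6, 6), dtype=float)
image[:, 3:] = 255.0

# Horizontal gradient: difference between each pixel and its left neighbour.
gx = np.zeros_like(image)
gx[:, 1:] = image[:, 1:] - image[:, :-1]

# Vertical gradient: difference between each pixel and the one above it.
gy = np.zeros_like(image)
gy[1:, :] = image[1:, :] - image[:-1, :]

# Edge strength is the gradient magnitude; large values mark boundaries.
magnitude = np.sqrt(gx ** 2 + gy ** 2)
edges = magnitude > 100   # boolean mask of edge pixels

# Columns that contain edge pixels: the dark-to-bright boundary.
edge_columns = np.unique(np.nonzero(edges)[1])
print(edge_columns)
```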
4. Detection/Segmentation
Detects objects or regions of interest in the image.
Common tasks:
1. Classification: Identifies what the object is (e.g., “This is a
dog”).
2. Classification + Localization: Identifies the object and its
position using a bounding box.
3. Object Detection: Finds and labels multiple objects in one
image.
4. Image Segmentation: Separates pixels into regions for better
understanding.
o Semantic Segmentation: Groups similar pixels into one
class (e.g., all trees).
o Instance Segmentation: Differentiates between
individual objects of the same class (e.g., three separate
dogs).
Example: In a self-driving car, segmentation helps identify lanes,
pedestrians, and traffic lights.
5. High-Level Processing
Final stage: interpreting results to make decisions.
The system recognizes objects, understands context, and analyses
scenes.
Example: In healthcare, after detecting a tumour in a scan, the
system suggests whether it is likely cancerous.
Q2. What are the Applications of Computer Vision in real life?
Explain with examples.
Answer:
Computer Vision (CV) is one of the most important fields of Artificial
Intelligence. It helps machines not only to see images but also to analyze,
understand, and take decisions based on visual data. Today, Computer
Vision is applied in almost every sector. Some major applications are:
1. Healthcare and Medical Imaging
CV helps in detecting diseases early through medical scans like
X-rays, MRI, and CT scans.
It can identify tumors, fractures, or infections that doctors may miss.
Example: AI-based systems detect early signs of cancer from
mammogram images.
2. Self-Driving Cars
Autonomous vehicles use computer vision to see and interpret the
road.
CV detects pedestrians, other vehicles, road signs, and traffic lights.
It helps the car in lane detection, obstacle avoidance, and safe
navigation.
Example: Tesla’s Autopilot uses cameras and CV algorithms for
decision-making.
3. Surveillance and Security
CV is widely used in CCTV cameras and monitoring systems.
It can recognize faces, detect suspicious behavior, and even track
objects in real-time.
Example: Airports use face recognition systems for passenger
verification.
4. Retail and E-commerce
CV helps in automated checkout systems, stock management, and
customer behavior analysis.
Online shopping platforms use CV for visual search – a customer can
upload a photo to find similar products.
Example: Amazon Go stores use CV to allow customers to shop without
cashiers.
5. Agriculture
Farmers use CV to monitor crop health, detect pests, and analyze
soil conditions.
Drones with cameras capture images of fields, and CV algorithms
detect problems.
Example: CV detects whether leaves of crops show disease symptoms.
6. Entertainment and Augmented Reality (AR)
CV is used in video games, movies, and filters on social media
apps.
AR applications overlay digital images on real-world objects using CV.
Example: Snapchat and Instagram filters use facial landmark detection.
7. Industrial Automation
In factories, CV helps in quality inspection and ensures products
meet standards.
Robots use CV to identify, pick, and assemble objects on production
lines.
Example: Detecting defective chips in an electronics factory.