Worksheet Class 12 Ai


UNIT 1: PYTHON PROGRAMMING-II

1–10: NumPy Basics and Arrays

1. Which of these is the conventional way to import NumPy?

A) import numpy
B) import numpy as np
C) import np as numpy
D) from numpy import *

2. What does NumPy primarily provide in Python?

A) Data visualization tools


B) Multi-dimensional arrays and high-speed operations
C) Natural Language Processing features
D) Web development framework

3. What is the main advantage of NumPy arrays over Python lists for
numerical operations?

A) They support string manipulation


B) They operate faster due to optimized memory and vectorized operations
C) They can hold multiple data types in the same array
D) They are immutable

4. What is the output shape of np.array([[1,2,3],[4,5,6]])?

A) (3,2)
B) (2,3)
C) (6,)
D) (2,2)

5. Which operation in NumPy applies element-wise?

A) Matrix inversion
B) Matrix multiplication only
C) Element-wise addition and multiplication using arrays
D) None of the above

6. Which method converts a Python list into a NumPy array?

A) np.list()
B) np.convert()
C) np.array()
D) np.make_array()

7. In NumPy, what is arr.dtype used for?


A) To get the number of dimensions
B) To know the data type of elements in the array
C) To reshape an array
D) To get array length

8. What does arr.shape return in NumPy?

A) The datatype
B) Size of the array
C) The dimensions of the array as a tuple
D) Total number of elements

9. What is the output of np.zeros((2,3))?

A) Array of ones with shape (2,3)


B) Array of zeros with shape (2,3)
C) Array of random numbers
D) Error

10. To get a flattened version of an array arr, which method is appropriate?

A) arr.flatten()
B) arr.reshape()
C) arr.flat()
D) arr.expand()

11–20: Pandas Foundations

11. What is a Series in Pandas?

A) A 2-dimensional labeled data structure


B) A 1-dimensional labeled array
C) A function in Pandas
D) An unordered collection

12. What is a DataFrame in Pandas?

A) A 1-dimensional labeled array


B) A 2-dimensional labeled data structure with columns
C) A visualization object
D) A type of Python dictionary

13. How do you create a Pandas Series from a list named lst?
A) pd.Series(lst)
B) pd.DataFrame(lst)
C) pd.Array(lst)
D) pd.Series.create(lst)

14. To import a CSV file into a DataFrame named df?

A) pd.load_csv('file.csv')
B) pd.read_csv('file.csv')
C) df.read_csv('file.csv')
D) np.read('file.csv')

15. To export DataFrame df to a CSV named out.csv?

A) df.to_csv('out.csv')
B) pd.to_csv(df, 'out.csv')
C) df.save('out.csv')
D) df.write_csv()

16. What does df.head() do?

A) Shows last 5 rows


B) Shows first 5 rows
C) Returns column names
D) Displays summary statistics

17. To view DataFrame summary stats like mean and std dev?

A) df.summary()
B) df.info()
C) df.describe()
D) df.stats()

18. To rename a column from 'old' to 'new' in df?

A) df.rename(columns={'old':'new'}, inplace=True)
B) df.change('old','new')
C) df.columns = 'new'
D) df.rename_column(old=new)

19. Which method helps read only specific columns from a CSV?

A) usecols= parameter in read_csv


B) df.select_cols()
C) df.columns(['col1','col2'])
D) None of the above
20. Setting a specific column as index in a CSV import?

A) pd.read_csv('f.csv', index='col')
B) pd.read_csv('f.csv', index_col='col')
C) df.set_index('col')
D) None of the above

21–30: Handling Missing Data & DataFrame Operations

21. Which Pandas function identifies missing values?

A) df.missing()
B) df.isnull()
C) df.nan()
D) df.null()

22. To drop rows with missing values?

A) df.dropna()
B) df.remove_na()
C) df.na.drop()
D) df.clean()

23. To fill missing values with zero?

A) df.fillna(0)
B) df.na(0)
C) df.replace(None, 0)
D) df.zero()

24. How to fill missing values in column 'age' with mean?

A) df['age'].fillna(df['age'].mean(), inplace=True)
B) df.fillna('age', mean)
C) df['age'].mean_fill()
D) df.mean('age').fill()

25. How to drop a column named 'col'?

A) df.drop(columns=['col'])
B) df.remove('col')
C) df.del('col')
D) del df['col'] (also valid in pandas)

26. Add a new column 'new' calculated from 'col1' + 'col2'?


A) df['new'] = df['col1'] + df['col2']
B) df.new = df.col1 + df.col2
C) df.add_column('new', df['col1'] + df['col2'])
D) Only A is correct

27. To sort DataFrame df by a column 'score' descending?

A) df.sort('score', ascending=False)
B) df.sort_values(by='score', ascending=False)
C) df.order_by('score', desc=True)
D) df.sort_desc('score')

28. How to reset an index of DataFrame?

A) df.reset_index(drop=True)
B) df.reset()
C) df.set_index(None)
D) df.reindex()

29. Which method returns data types of each column?

A) df.info()
B) df.types()
C) df.dtypes
D) Both A and C work (df.dtypes shows column types; info() includes types)

30. Conversion of a column 'col' to integer type?

A) df['col'].astype(int)
B) df['col'].convert(int)
C) df.astype('col', int)
D) df.typecast('col', int)

31–40: Linear Regression (Advanced Learners)

31. What is the simplest form of regression in ML?

A) Logistic regression
B) Linear regression
C) Polynomial regression
D) None of these

32. In linear regression with 1 feature, the equation y = ax + b denotes:
A) y is feature, x is target
B) x is predictor, y is response; a = slope, b = intercept
C) a is intercept, b is slope
D) None of these

33. Which Python library can implement linear regression?

A) NumPy
B) Pandas
C) Scikit-learn
D) Matplotlib

34. How to split data into training and testing sets?

A) Manual slicing
B) train_test_split from Scikit-learn
C) Pandas df.split()
D) NumPy np.split()

35. In Scikit-learn, which class is used for linear regression?

A) LinearModel
B) LinearRegression
C) Regression
D) LinearFit

36. After training a linear regression model named model, how do you make predictions on X_test?

A) model.predict(X_test)
B) model.predict()
C) model.predict(X_train)
D) predict(model, X_test)

37. Which metric calculates the average squared error between actual and predicted values?

A) Mean Squared Error (MSE)


B) Root Mean Squared Error (RMSE)
C) Both A and B
D) Mean Absolute Error (MAE)

38. Which gives error in same units as target variable?

A) MSE
B) RMSE
C) Both
D) None

39. Which library provides functions for MSE and RMSE?

A) NumPy
B) Pandas
C) Scikit-learn (mean_squared_error)
D) Matplotlib

40. Why use a train-test split in regression?

A) To test model on unseen data and judge performance


B) To increase accuracy artificially
C) To reduce dataset size
D) It's optional and not helpful

ANSWER KEY:

1–10: NumPy Basics and Arrays

1. B) import numpy as np
2. B) Multi-dimensional arrays and high-speed operations
3. B) They operate faster due to optimized memory and vectorized
operations
4. B) (2,3)
5. C) Element-wise addition and multiplication using arrays
6. C) np.array()
7. B) To know the data type of elements in the array
8. C) The dimensions of the array as a tuple
9. B) Array of zeros with shape (2,3)
10. A) arr.flatten()
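The answers above can be checked with a short sketch. The array values are made up for illustration; only standard NumPy calls are used.

```python
import numpy as np

# Build a 2-D array from a nested Python list (questions 4 and 6).
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape)   # (2, 3): 2 rows, 3 columns
print(arr.dtype)   # an integer type; the exact width depends on the platform

# np.zeros returns an array of the requested shape filled with 0.0 (question 9).
zeros = np.zeros((2, 3))

# Arithmetic operators act element by element (question 5).
doubled = arr * 2
summed = arr + arr

# flatten() returns a 1-D copy of the array (question 10).
flat = arr.flatten()
print(flat)        # [1 2 3 4 5 6]
```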

11–20: Pandas Foundations

11. B) A 1-dimensional labeled array

12. B) A 2-dimensional labeled data structure with columns

13. A) pd.Series(lst)
14. B) pd.read_csv('file.csv')

15. A) df.to_csv('out.csv')

16. B) Shows first 5 rows

17. C) df.describe()

18. A) df.rename(columns={'old':'new'}, inplace=True)

19. A) usecols= parameter in read_csv

20. B) pd.read_csv('f.csv', index_col='col')
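A minimal sketch tying several of these answers together. The CSV text and column names are hypothetical, and io.StringIO stands in for a real file so the example runs without anything on disk.

```python
import io
import pandas as pd

# Hypothetical CSV content; io.StringIO lets read_csv treat a string like a file.
csv_text = "RollNo,Name,old\n101,Amit,87\n102,Riya,92\n103,Kiran,78\n"

# usecols limits which columns are read; index_col chooses the index column.
df = pd.read_csv(io.StringIO(csv_text), usecols=["RollNo", "old"], index_col="RollNo")

# rename returns a new DataFrame unless inplace=True is passed.
df = df.rename(columns={"old": "Marks"})

print(df.head())       # first 5 rows (only 3 exist here)
print(df.describe())   # count, mean, std, min, quartiles, max

# A Series is a one-dimensional labelled array.
s = pd.Series([10, 20, 30])

# to_csv with no path returns the CSV text instead of writing a file.
csv_out = df.to_csv()
```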

21–30: Handling Missing Data & DataFrame Operations

21. B) df.isnull()

22. A) df.dropna()

23. A) df.fillna(0)

24. A) df['age'].fillna(df['age'].mean(), inplace=True)

25. A) df.drop(columns=['col']) (D also works but A is standard)

26. A) df['new'] = df['col1'] + df['col2']

27. B) df.sort_values(by='score', ascending=False)

28. A) df.reset_index(drop=True)

29. D) Both A and C work (df.dtypes shows column types; info() includes types)

30. A) df['col'].astype(int)
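The sketch below walks through the same operations on a small made-up DataFrame. One assumption to note: it assigns the result of fillna back to the column rather than using inplace=True on a selected column, because recent pandas versions warn about that chained in-place form.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age and one missing score.
df = pd.DataFrame({
    "name": ["Amit", "Riya", "Kiran", "Neha"],
    "age": [16, np.nan, 17, 18],
    "score": [87.0, 92.0, np.nan, 85.0],
})

# isnull() marks missing cells; sum() counts them per column.
missing_per_column = df.isnull().sum()

# dropna() returns a new DataFrame without rows that contain a missing value.
complete_rows = df.dropna()

# Fill missing ages with the column mean and missing scores with zero.
df["age"] = df["age"].fillna(df["age"].mean())
df["score"] = df["score"].fillna(0)

# astype converts the column once no missing values remain.
df["age"] = df["age"].astype(int)

# Sort by score, highest first, and renumber the rows from 0.
ranked = df.sort_values(by="score", ascending=False).reset_index(drop=True)
print(ranked)
```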

31–40: Linear Regression (Advanced Learners)

31. B) Linear regression

32. B) x is predictor, y is response; a = slope, b = intercept

33. C) Scikit-learn

34. B) train_test_split from Scikit-learn

35. B) LinearRegression
36. A) model.predict(X_test)

37. A) Mean Squared Error (MSE)

38. B) RMSE

39. C) Scikit-learn (mean_squared_error)

40. A) To test model on unseen data and judge performance
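The workflow behind these answers (split, fit, predict, measure error) can be sketched as follows. To keep the example dependency-light it swaps in NumPy's polyfit and a manual slice in place of scikit-learn's LinearRegression, train_test_split and mean_squared_error, which are the tools the answers name. The data is hypothetical.

```python
import numpy as np

# Hypothetical data: hours studied (x) and marks scored (y), exactly y = 5x + 40.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = 5 * x + 40

# Manual train-test split: first six points for training, last two held out.
x_train, x_test = x[:6], x[6:]
y_train, y_test = y[:6], y[6:]

# Fit y = a*x + b by least squares; for degree 1, polyfit returns [slope, intercept].
a, b = np.polyfit(x_train, y_train, 1)

# Predict on the unseen points and measure the error.
y_pred = a * x_test + b
mse = np.mean((y_test - y_pred) ** 2)   # average squared error
rmse = np.sqrt(mse)                     # same units as y

print(f"slope={a:.2f}, intercept={b:.2f}, MSE={mse:.4f}, RMSE={rmse:.4f}")
```

Because the made-up data lies exactly on a line, the error here is essentially zero; real data would give a positive MSE.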

DESCRIPTIVE QUESTIONS AND ANSWERS:

1. Create DataFrame from List of Lists

import pandas as pd

# Each inner list is one row of the table
data = [
    [101, "Amit", 87],
    [102, "Riya", 92],
    [103, "Kiran", 78]
]

df = pd.DataFrame(data, columns=["RollNo", "Name", "Marks"])

print("DataFrame created from list of lists:\n", df)

2. Create DataFrame from Dictionary

import pandas as pd

data = {

"RollNo": [101, 102, 103],

"Name": ["Amit", "Riya", "Kiran"],

"Marks": [87, 92, 78]


}

df = pd.DataFrame(data)

print("DataFrame created from dictionary:\n", df)

3. Access Rows and Columns

import pandas as pd

df = pd.DataFrame({

"RollNo": [101, 102, 103],

"Name": ["Amit", "Riya", "Kiran"],

"Marks": [87, 92, 78]

})

# Access column

print("Marks column:\n", df["Marks"])

# Access row by index

print("Second row:\n", df.iloc[1])

4. Boolean Indexing

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran", "Neha"],

"Marks": [87, 92, 78, 85]

})
# Students with Marks > 80

high_marks = df[df["Marks"] > 80]

print("Students with Marks > 80:\n", high_marks)

5. Filter Multiple Conditions

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran", "Neha"],

"Marks": [87, 92, 78, 85],

"Age": [16, 17, 16, 17]

})

# Students with Marks>80 AND Age=17

filtered = df[(df["Marks"] > 80) & (df["Age"] == 17)]

print("Filtered DataFrame:\n", filtered)

6. Add New Column Based on Calculation

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran"],

"Marks1": [87, 92, 78],

"Marks2": [90, 88, 80]

})
# Add total marks column

df["Total"] = df["Marks1"] + df["Marks2"]

print("DataFrame with Total Marks:\n", df)

7. Import CSV and Display

import pandas as pd

# Assuming 'student.csv' has columns RollNo, Name, Marks

df = pd.read_csv("student.csv")

print("CSV DataFrame:\n", df)

8. Export DataFrame to CSV

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran"],

"Marks": [87, 92, 78]

})

df.to_csv("output.csv", index=False)

print("DataFrame exported to 'output.csv'")

9. Access Specific Rows and Columns

import pandas as pd

df = pd.DataFrame({

"RollNo": [101, 102, 103, 104],


"Name": ["Amit", "Riya", "Kiran", "Neha"],

"Marks": [87, 92, 78, 85]

})

# Access first two rows

print("First two rows:\n", df.head(2))

# Access specific columns

print("Name and Marks columns:\n", df[["Name", "Marks"]])

10. Create DataFrame from List of Dictionaries

import pandas as pd

# Each dictionary is one row; the keys become column names
data = [
    {"RollNo": 101, "Name": "Amit", "Marks": 87},
    {"RollNo": 102, "Name": "Riya", "Marks": 92},
    {"RollNo": 103, "Name": "Kiran", "Marks": 78}
]

df = pd.DataFrame(data)

print("DataFrame from list of dictionaries:\n", df)

11. Append/Concatenate DataFrames

import pandas as pd

df1 = pd.DataFrame({

"Name": ["Amit", "Riya"],


"Marks": [87, 92]

})

df2 = pd.DataFrame({

"Name": ["Kiran", "Neha"],

"Marks": [78, 85]

})

# Append df2 to df1 (DataFrame.append was removed in pandas 2.0; use pd.concat)

df_appended = pd.concat([df1, df2], ignore_index=True)

print("Appended DataFrame:\n", df_appended)

12. Series Arithmetic Operations

import pandas as pd

s1 = pd.Series([10, 20, 30])

s2 = pd.Series([1, 2, 3])

# Operations are applied element-wise, matching values by index label

print("Addition:\n", s1 + s2)

print("Subtraction:\n", s1 - s2)

print("Multiplication:\n", s1 * s2)

print("Division:\n", s1 / s2)

13. Access Rows Using iloc and loc

import pandas as pd

df = pd.DataFrame({

"RollNo": [101, 102, 103],


"Name": ["Amit", "Riya", "Kiran"],

"Marks": [87, 92, 78]

})

# Access row by position

print("First row using iloc:\n", df.iloc[0])

# Access row by index label (if set index)

df.set_index("RollNo", inplace=True)

print("Row with RollNo 102 using loc:\n", df.loc[102])

14. Boolean Indexing with Multiple Conditions

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran", "Neha"],

"Marks": [87, 92, 78, 85],

"Age": [16, 17, 16, 17]


})

# Marks > 80 AND Age=17


result = df[(df["Marks"] > 80) & (df["Age"] == 17)]
print("Filtered DataFrame:\n", result)

15. Adding a Calculated Column


import pandas as pd

df = pd.DataFrame({
"Name": ["Amit", "Riya", "Kiran"],
"Marks1": [87, 92, 78],
"Marks2": [90, 88, 80]
})

# Average marks column


df["Average"] = (df["Marks1"] + df["Marks2"]) / 2
print("DataFrame with Average:\n", df)

16. Access Specific Rows & Columns

import pandas as pd

df = pd.DataFrame({
"RollNo": [101, 102, 103, 104],
"Name": ["Amit", "Riya", "Kiran", "Neha"],
"Marks": [87, 92, 78, 85]
})

# Access first 2 rows


print("First 2 rows:\n", df.head(2))

# Access Name and Marks columns


print("Name and Marks:\n", df[["Name", "Marks"]])

UNIT 2: DATA SCIENCE METHODOLOGY: AN ANALYTIC APPROACH TO CAPSTONE PROJECT

1–10: Problem Definition & Analytic Approach

1. The first step in a Data Science project is:


A) Data Cleaning
B) Modeling
C) Problem Definition
D) Evaluation

2. A Data Science project is also called a:


A) Capstone Project
B) Programming task
C) Spreadsheet task
D) Web project

3. Which step comes after Problem Definition?


A) Analytic Approach
B) Deployment
C) Data Cleaning
D) Feedback

4. Data Science Methodology helps us to:


A) Play games
B) Follow a clear process to solve problems
C) Learn drawing
D) Make speeches

5. In the Analytic Approach stage, we decide:


A) Which game to play
B) Which method or model to use
C) Which teacher to ask
D) Which exam to write

6. Predicting marks of students is an example of:


A) Regression
B) Classification
C) Clustering
D) Random guessing

7. Identifying whether a mail is spam or not is an example of:


A) Regression
B) Classification
C) Clustering
D) Data cleaning

8. Grouping customers based on purchase habits is:


A) Classification
B) Regression
C) Clustering
D) Evaluation

9. A well-defined problem should be:


A) Clear and measurable
B) Vague
C) Random
D) Confusing

10. Success criteria means:


A) How we will measure success of our project
B) How much data we have
C) What coding language we use
D) What chart we draw

11–20: Data Requirements, Collection & Understanding

11. Data requirement depends on:


A) Problem statement
B) Favorite subject
C) Mobile app used
D) Random choice

12. Tables in databases are examples of:


A) Structured data
B) Unstructured data
C) Videos
D) Images

13. Social media posts are:


A) Structured data
B) Unstructured data
C) Tabular data
D) Numeric data

14. Collecting data by survey is called:


A) Primary data collection
B) Secondary data collection
C) Tertiary data collection
D) Optional

15. Using already published government census is:


A) Primary data
B) Secondary data
C) Experimental data
D) Random data

16. Checking whether data is complete and correct is:


A) Data Understanding
B) Data Cleaning
C) Deployment
D) Coding
17. Missing values, duplicates, and outliers are:
A) Data quality issues
B) Data visualizations
C) Problem statements
D) Success criteria

18. Calculating mean, median, and mode comes under:


A) Data Understanding
B) Deployment
C) Problem Definition
D) Feedback

19. Data visualization helps to:


A) Understand data better
B) Waste time
C) Confuse students
D) Avoid coding

20. Pie charts and bar graphs are used in:


A) Data Visualization
B) Model Training
C) Problem Definition
D) Deployment

21–30: Data Preparation & Modeling

21. Removing duplicates is part of:


A) Data Preparation
B) Problem Definition
C) Deployment
D) Feedback

22. Filling missing values is part of:


A) Data Preparation
B) Deployment
C) Problem Definition
D) Evaluation

23. Converting text labels into numbers is called:


A) Encoding
B) Cleaning
C) Plotting
D) Deployment

24. Splitting data into training and testing is done in:


A) Data Preparation
B) Deployment
C) Problem Definition
D) Feedback

25. The variable we want to predict is called:


A) Target
B) Feature
C) Input
D) Record

26. The input variables used to predict are called:


A) Features
B) Targets
C) Labels
D) Models

27. In supervised learning, we have:


A) Input and output labels
B) Only inputs
C) Only outputs
D) Random data

28. In clustering, data is grouped:


A) Without labels
B) With labels
C) With answers given
D) With teacher’s notes

29. Predicting rainfall in cm is an example of:


A) Regression
B) Classification
C) Clustering
D) Deployment

30. Predicting whether a student passes or fails is an example of:


A) Classification
B) Regression
C) Clustering
D) Data Cleaning

31–40: Evaluation, Deployment & Feedback

31. Accuracy is used for:


A) Classification models
B) Regression models
C) Clustering models
D) Data visualization

32. Mean Squared Error is used for:


A) Regression models
B) Classification models
C) Clustering models
D) None

33. Precision and Recall are for:


A) Classification problems
B) Regression problems
C) Data cleaning
D) Deployment
34. Cross-validation is used for:
A) Checking model performance
B) Collecting data
C) Drawing graphs
D) Defining problem
35. ROC curve is used in:
A) Classification models
B) Regression models
C) Clustering models
D) Data preparation
36. In deployment stage, the model is:
A) Used in real world
B) Deleted
C) Tested again
D) Ignored
37. Feedback stage helps in:
A) Improving the model
B) Stopping the project
C) Avoiding evaluation
D) Deleting data
38. Data Science project is considered complete when:
A) Model is deployed and working well
B) Data is collected
C) Problem is written
D) Graph is drawn
39. The last step in the Data Science Methodology is:
A) Feedback and Iteration
B) Modeling
C) Data Collection
D) Evaluation
40. A Capstone Project means:
A) A complete project applying all steps of Data Science Methodology
B) A single coding exercise
C) A survey form
D) A group discussion

ANSWER KEY:

1–10: Problem Definition & Analytic Approach

1. C) Problem Definition

2. A) Capstone Project

3. A) Analytic Approach

4. B) Follow a clear process to solve problems

5. B) Which method or model to use

6. A) Regression

7. B) Classification

8. C) Clustering

9. A) Clear and measurable

10. A) How we will measure success of our project

11–20: Data Requirements, Collection & Understanding


11. A) Problem statement

12. A) Structured data

13. B) Unstructured data

14. A) Primary data collection

15. B) Secondary data

16. A) Data Understanding

17. A) Data quality issues

18. A) Data Understanding

19. A) Understand data better

20. A) Data Visualization

21–30: Data Preparation & Modeling

21. A) Data Preparation

22. A) Data Preparation

23. A) Encoding

24. A) Data Preparation

25. A) Target

26. A) Features

27. A) Input and output labels

28. A) Without labels

29. A) Regression

30. A) Classification

31–40: Evaluation, Deployment & Feedback

31. A) Classification models

32. A) Regression models

33. A) Classification problems


34. A) Checking model performance

35. A) Classification models

36. A) Used in real world

37. A) Improving the model

38. A) Model is deployed and working well

39. A) Feedback and Iteration

40. A) A complete project applying all steps of Data Science Methodology

DESCRIPTIVE QUESTIONS:

1. What are the main steps in the Data Science Methodology? Explain briefly.

Main Steps of the Data Science Methodology

Big picture: It’s a repeatable, iterative cycle that moves from a real-world
problem to a working solution, and then improves it with feedback.
1) Business Understanding (Problem Scoping)

Key question: What problem are we solving and why?


What we do: Talk to stakeholders, use 5W1H and Design Thinking, define
goals, scope, success criteria, constraints.
Deliverable: Clear problem statement + success measure.
Mini example: A school wants to identify students who may need extra
help before finals (goal: reduce failures by 10%).

2) Analytic Approach

Key question: How can data answer this problem?


What we do: Map the problem to an analysis type—classification,
regression, clustering, anomaly detection, recommendation; choose
descriptive/diagnostic/predictive/prescriptive analytics.
Deliverable: Chosen approach (e.g., classification).
Mini example: “Pass/Fail” is classification.

3) Data Requirements

Key question: What data do we need?


What we do: Decide data types (numbers/text/images), format
(tables/CSV), granularity, time range, quality, ethics/privacy.
Deliverable: Data requirement/specification document.
Mini example: Needed features: attendance %, internal test scores,
homework completion, study hours.

4) Data Collection

Key question: Where will we get it from?


What we do: Gather data from primary (surveys, forms, sensors) and
secondary (databases, open data) sources.
Deliverable: Raw dataset + data dictionary.
Mini example: Pull records from the school MIS; add a short student survey
for study hours.

5) Data Understanding (Exploration)

Key question: Does the data represent the problem correctly?


What we do: Explore with statistics and visuals; check distributions,
correlations, missing values, outliers, data leakage.
Deliverable: EDA (Exploratory Data Analysis) summary with findings/issues.
Mini example: Notice many missing study hours for Grade 12;
attendance strongly correlates with outcomes.
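A minimal sketch of this exploration step with pandas. The column names and values are hypothetical and are chosen only to mirror the school example.

```python
import pandas as pd

# Hypothetical student records; None marks a missing survey answer.
students = pd.DataFrame({
    "attendance_pct": [95, 80, 60, 90, 55, 70],
    "study_hours": [3.0, 2.0, None, 2.5, None, 1.5],
    "final_marks": [88, 74, 52, 81, 45, 63],
})

# Summary statistics for every numeric column.
print(students.describe())

# Count of missing values per column.
missing = students.isnull().sum()
print(missing)

# Pearson correlation between attendance and final marks.
corr = students["attendance_pct"].corr(students["final_marks"])
print(f"attendance vs marks correlation: {corr:.2f}")
```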

6) Data Preparation (Cleaning & Feature Engineering)

Key question: How do we make data model-ready?


What we do:

• Clean: fix typos, handle missing values, remove duplicates.
• Transform: encode categories, scale numbers, derive new features.
• Integrate: merge tables; split into train/validation/test.


Deliverable: Tidy, modelling-ready dataset. (Most time-consuming
step!)
Mini example: Create “term average”, “consistency score”;
impute missing study hours; one-hot encode streams (Sci/Comm/Arts).
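The cleaning, deriving and encoding described above can be sketched with pandas. All column names and values are hypothetical, and median imputation is just one reasonable choice for the missing study hours.

```python
import pandas as pd

# Hypothetical raw records with a duplicate row and a missing value.
raw = pd.DataFrame({
    "name": ["Amit", "Riya", "Riya", "Kiran"],
    "stream": ["Sci", "Comm", "Comm", "Arts"],
    "test1": [80, 90, 90, 60],
    "test2": [70, 85, 85, 75],
    "study_hours": [2.0, 3.0, 3.0, None],
})

# Clean: remove exact duplicate rows, then impute missing study hours with the median.
clean = raw.drop_duplicates().reset_index(drop=True)
clean["study_hours"] = clean["study_hours"].fillna(clean["study_hours"].median())

# Derive: a new feature computed from existing columns.
clean["term_average"] = (clean["test1"] + clean["test2"]) / 2

# Transform: one-hot encode the categorical 'stream' column.
prepared = pd.get_dummies(clean, columns=["stream"])
print(prepared)
```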

7) Modelling

Key question: Which algorithm fits best?


What we do: Train candidate models (e.g., Decision Tree, Logistic
Regression), tune hyperparameters, use cross-validation.
Deliverable: One or more candidate models with training results.
Mini example: Train a Decision Tree to classify Pass/Fail.

8) Evaluation

Key question: Does the model answer the original question well?
What we do: Test on unseen data; use metrics (Accuracy, Precision, Recall,
F1 for classification; MAE/MSE/RMSE for regression). Check against business
success criteria (from Step 1).
Deliverable: Evaluation report + decision (proceed/adjust).
Mini example: Model gets F1 = 0.86; meets target (≥0.80). If not, revisit
features or approach.
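The classification metrics named above follow directly from four counts. The sketch below uses made-up labels and plain Python so the arithmetic stays visible; a library function would normally do this in practice.

```python
# Hypothetical labels: 1 = "needs support", 0 = "does not need support".
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(actual, predicted))

# Count the four possible outcomes of a binary classifier.
tp = sum(1 for a, p in pairs if a == 1 and p == 1)   # true positives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)   # true negatives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)   # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)   # false negatives

accuracy = (tp + tn) / len(actual)          # share of all predictions that are right
precision = tp / (tp + fp)                  # share of flagged students who needed help
recall = tp / (tp + fn)                     # share of students needing help who were flagged
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)      # 0.8 0.75 0.75 0.75
```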

9) Deployment

Key question: How does the user get the solution?


What we do: Put the model into use—web app, mobile app, dashboard, API;
decide batch vs real-time; add basic monitoring.
Deliverable: Working solution in the real environment.
Mini example: A simple teacher dashboard flags students needing
support each week.

10) Feedback (Monitoring & Improvement)

Key question: Is the problem solved? What can we improve?


What we do: Gather user feedback, track performance over time, detect
drift, retrain periodically, refine features and rules.
Deliverable: Updated model/process; continuous improvement loop.
Mini example: Teachers report some false alarms after vacations → adjust
the model to consider holiday weeks.

2. Explain the importance of Data Understanding and Data Preparation in the Data Science Methodology with examples.

Answer:

After collecting data, it cannot be directly used for analysis because raw data
is often incomplete, inconsistent, or contains errors. Two important steps at
this stage are Data Understanding and Data Preparation.

1. Data Understanding

o In this step, the data collected is carefully studied.


o The aim is to know what kind of data is available, its format, size,
and quality.
o Example: If a school collects student marks data, we check how
many subjects are there, whether marks are stored as numbers,
and if any values are missing.
o This step helps in deciding whether the data is sufficient for
solving the problem.
2. Data Preparation
o Also called data cleaning or data preprocessing.
o It involves correcting errors, filling missing values, removing
duplicates, and converting data into a useful format.
o Example: In the student marks data, if some records have
“Absent” instead of marks, it must be handled properly (e.g.,
replaced with 0 or marked as missing).
o Without preparation, the model may give wrong or misleading
results.
3. Why these steps are important
o Good quality data leads to reliable results.
o Poor or unclean data leads to wrong conclusions, which can
affect decision-making.
o These steps ensure that the data is accurate, consistent, and
suitable for building models.
👉 Thus, Data Understanding and Preparation are the backbone of the
Data Science process, because “garbage in, garbage out” – if we feed
wrong data, we will get wrong results.
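The "Absent" example can be sketched in pandas as follows. The records are hypothetical, and treating "Absent" as missing (rather than as 0) is an assumption made here so that absences do not pull the class mean down.

```python
import pandas as pd

# Hypothetical marks sheet where absences were typed in as text.
marks = pd.DataFrame({
    "Name": ["Amit", "Riya", "Kiran", "Riya"],
    "Marks": ["87", "Absent", "78", "Absent"],
})

# Data Understanding: the Marks column is stored as text, not numbers.
print(marks.dtypes)

# Data Preparation: remove duplicate records, then turn non-numeric entries into NaN.
marks = marks.drop_duplicates().reset_index(drop=True)
marks["Marks"] = pd.to_numeric(marks["Marks"], errors="coerce")

absent_count = int(marks["Marks"].isnull().sum())
class_mean = marks["Marks"].mean()   # NaN values are skipped by default

print(absent_count, class_mean)      # 1 82.5
```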

UNIT 3: MAKING MACHINES SEE

1–10: Basics of Computer Vision

1. What does "Computer Vision" mean?


A) Teaching machines to see and understand images
B) Teaching machines to write code
C) Teaching machines to listen to music
D) Teaching machines to play games

2. Which of the following is an example of Computer Vision?


A) Face recognition in mobile phones
B) Listening to songs
C) Reading novels
D) Sending emails

3. Which sense of humans does Computer Vision try to imitate?


A) Hearing
B) Vision
C) Touch
D) Smell

4. Which technology is used in self-driving cars to detect obstacles?


A) Computer Vision
B) Natural Language Processing
C) Cloud Computing
D) Virtual Reality

5. OCR stands for:


A) Optical Character Recognition
B) Object Character Recognition
C) Optical Camera Reading
D) Open Character Reading

6. Which device is commonly used to capture images for computer vision?
A) Camera
B) Microphone
C) Keyboard
D) Speaker

7. In computer vision, an image is represented as:


A) Rows and columns of pixels
B) Blocks of sound waves
C) Pages of text
D) Series of commands

8. Each pixel in a grayscale image contains:


A) One intensity value
B) Three values (RGB)
C) Only 0 or 1
D) A text label

9. What does RGB stand for in images?


A) Red, Green, Blue
B) Read, Go, Black
C) Random, General, Binary
D) Range, Grid, Brightness

10. Which type of image uses only two colors: black and white?
A) Binary Image
B) Grayscale Image
C) RGB Image
D) Color Image

11–20: Image Types and Features

11. Which type of image stores one intensity value from 0 to 255 per pixel?
A) Grayscale
B) Binary
C) RGB
D) None
12. A color image (RGB) has how many channels?
A) 3
B) 2
C) 1
D) 4

13. Which of the following is an application of Computer Vision?


A) Detecting diseases in X-ray images
B) Listening to music
C) Writing stories
D) Sending SMS

14. Facial recognition system in airports is an example of:


A) Computer Vision
B) Speech Recognition
C) Data Science
D) Robotics only

15. Number plate detection in traffic cameras is called:


A) Automatic Number Plate Recognition (ANPR)
B) Automatic Name Plate Reading
C) Auto Numeric Processing
D) Artificial Numeric Prediction

16. Which of these is NOT a step in computer vision?


A) Capturing image
B) Processing image
C) Extracting features
D) Listening to songs

17. Edge detection is used to:


A) Find boundaries in images
B) Increase sound quality
C) Write texts faster
D) Store data

18. Which tool/library is widely used in Python for Computer Vision?


A) OpenCV
B) NumPy only
C) MS Word
D) PowerPoint
19. In image processing, filtering is used to:
A) Remove noise
B) Write documents
C) Send emails
D) Play music

20. Convolution is a process mainly used in:


A) Image processing and deep learning
B) Emailing
C) Gaming only
D) Text editing

21–30: Deep Learning & Vision

21. CNN stands for:


A) Convolutional Neural Network
B) Central Neural Node
C) Computer Numeric Network
D) Common Neural Net

22. CNN is widely used for:


A) Image recognition
B) Music composition
C) Text writing
D) Weather reporting

23. In CNN, the convolution layer is used to:


A) Detect features in an image
B) Play games
C) Send messages
D) Translate languages

24. Pooling layer in CNN is used for:


A) Reducing size of feature maps
B) Sending emails
C) Increasing volume
D) Writing code

25. Which of these is NOT an application of Computer Vision?


A) Object Detection
B) Speech-to-Text conversion
C) Face Detection
D) Medical Imaging

26. Object detection means:


A) Identifying and locating objects in an image
B) Counting numbers
C) Detecting sound
D) Writing notes

27. Image classification means:


A) Assigning a label to an image
B) Assigning sound to an image
C) Breaking the image into small pieces
D) Removing pixels

28. Which of these is an everyday use of Computer Vision?


A) Google Lens
B) MS Paint
C) Notepad
D) Excel only

29. Which industry uses computer vision for quality control?


A) Manufacturing
B) Music
C) Literature
D) Tourism

30. In healthcare, computer vision can help in:


A) Detecting tumors in scans
B) Cooking food
C) Playing sports
D) Teaching languages

31–40: Advanced Applications & Ethics

31. Which company uses Computer Vision in self-driving cars?


A) Tesla
B) WhatsApp
C) Spotify
D) Twitter
32. Which social media app uses computer vision for automatic
photo tagging?
A) Facebook
B) WhatsApp
C) Telegram
D) Signal
33. Gesture recognition allows computers to:
A) Understand hand and body movements
B) Understand smell
C) Understand music
D) Understand novels
34. Which of these is a challenge in computer vision?
A) Poor image quality
B) Bright lighting always
C) Correct grammar
D) High volume
35. CAPTCHA test on websites uses:
A) Computer Vision & Pattern Recognition
B) Music recognition
C) Text summarization
D) Story writing
36. Which of these is NOT a concern in Computer Vision?
A) Privacy issues
B) Data security
C) Image accuracy
D) Cooking recipes
37. Which application uses Computer Vision in sports?
A) Tracking ball movement in cricket/football
B) Reading books aloud
C) Generating poems
D) Playing songs
38. Augmented Reality (AR) uses Computer Vision for:
A) Overlaying digital images on real world
B) Sending text messages
C) Cooking recipes
D) Playing audio
39. Which of the following is an example of biometric system using
Computer Vision?
A) Face unlock in smartphones
B) Fingerprint sensor
C) Voice recognition
D) Password typing
40. The main goal of Computer Vision is:
A) To enable machines to see, process, and understand images like
humans
B) To write emails faster
C) To play games
D) To talk like humans

ANSWER KEY:
1–10: Basics of Computer Vision
1. A) Teaching machines to see and understand images
2. A) Face recognition in mobile phones
3. B) Vision
4. A) Computer Vision
5. A) Optical Character Recognition
6. A) Camera
7. A) Rows and columns of pixels
8. A) One intensity value
9. A) Red, Green, Blue
10. A) Binary Image

11–20: Image Types and Features


11. A) Grayscale
12. A) 3
13. A) Detecting diseases in X-ray images
14. A) Computer Vision
15. A) Automatic Number Plate Recognition (ANPR)
16. D) Listening to songs
17. A) Find boundaries in images
18. A) OpenCV
19. A) Remove noise
20. A) Image processing and deep learning
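The three image types from the questions above can be represented as NumPy arrays. This is a minimal sketch with made-up pixel values; the threshold of 127 for the binary image is an arbitrary choice.

```python
import numpy as np

# A 2x3 grayscale image: one intensity value (0-255) per pixel.
gray = np.array([[0, 128, 255],
                 [64, 200, 30]], dtype=np.uint8)

# A binary image keeps only two values; here pixels brighter than 127 become 1.
binary = (gray > 127).astype(np.uint8)

# An RGB image adds a third axis holding 3 channels: red, green, blue.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]   # top-left pixel is pure red

print(gray.shape)   # (2, 3): rows and columns of pixels
print(rgb.shape)    # (2, 3, 3): rows, columns, channels
print(binary)
```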

21–30: Deep Learning & Vision


21. A) Convolutional Neural Network
22. A) Image recognition
23. A) Detect features in an image
24. A) Reducing size of feature maps
25. B) Speech-to-Text conversion
26. A) Identifying and locating objects in an image
27. A) Assigning a label to an image
28. A) Google Lens
29. A) Manufacturing
30. A) Detecting tumors in scans
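To make the convolution and pooling answers concrete, here is a minimal NumPy sketch. The helper names convolve2d and max_pool2x2 are hypothetical, and the kernel is hand-picked; in a real CNN the kernel values are learned during training.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image with no padding and stride 1.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it computes a cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]   # drop an odd last row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A 5x5 image that is dark on the left and bright on the right.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# A vertical-edge kernel: responds where brightness rises from left to right.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = convolve2d(image, kernel)   # shape (3, 3), strongest near the edge
pooled = max_pool2x2(feature_map)         # shape (1, 1), a smaller summary

print(feature_map)
print(pooled)
```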

31–40: Advanced Applications & Ethics


31. A) Tesla
32. A) Facebook
33. A) Understand hand and body movements
34. A) Poor image quality
35. A) Computer Vision & Pattern Recognition
36. D) Cooking recipes
37. A) Tracking ball movement in cricket/football
38. A) Overlaying digital images on the real world
39. A) Face unlock in smartphones
40. A) To enable machines to see, process, and understand images
like humans

II. DESCRIPTIVE ANSWERS:
Q1. Explain the five stages of the Computer Vision process with
examples.
Answer:
The Computer Vision (CV) process allows machines to “see” and interpret
visual data such as images and videos. It generally follows five main
stages:

1. Image Acquisition
• First step: capturing or obtaining a digital image or video.
• Sources: digital cameras, scanners, medical devices (such as MRI and
CT scanners), or design software.
• Quality depends on camera resolution, lighting, and angle.
• Example: A CCTV camera captures images of people entering a
building.

2. Preprocessing
• Raw images may contain noise, distortions, or uneven brightness.
Preprocessing prepares them for analysis.
• Common techniques:
   o Noise Reduction: Removes random specks and grain from the image.
   o Normalization: Rescales pixel values to a fixed range (for example,
     0 to 1).
   o Resizing/Cropping: Makes all images the same size.
   o Histogram Equalization: Improves contrast by spreading pixel
     intensities over the full brightness range.
• Example: Cleaning up a noisy or poorly lit passport photo for
verification.
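
The normalization and histogram equalization techniques listed above can be sketched in plain Python on a tiny grayscale image stored as a list of rows. This is only an illustration under assumptions: the function names `normalize` and `equalize_histogram` are hypothetical, pixel values are assumed to be integers from 0 to 255, and a real project would normally rely on a library such as OpenCV or NumPy instead.

```python
def normalize(image):
    """Min-max normalization: rescale pixel values to the range 0.0 to 1.0."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # every pixel is the same: avoid dividing by zero
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]


def equalize_histogram(image, levels=256):
    """Histogram equalization for a grayscale image with values 0..levels-1."""
    flat = [p for row in image for p in row]

    # Count how many pixels have each intensity value.
    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    # Cumulative distribution: pixels with intensity <= each level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:  # only one intensity present: nothing to spread
        return [row[:] for row in image]

    # Map each old intensity to a new one spread over the full range.
    mapping = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[mapping[p] for p in row] for row in image]


# A low-contrast 2x2 image: all values sit between 50 and 80.
image = [[50, 60], [70, 80]]
normalized = normalize(image)
equalized = equalize_histogram(image)
print(equalized)  # [[0, 85], [170, 255]]
```

After equalization the four pixel values are spread across the whole 0 to 255 range, which is why the image appears to have more contrast.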

3. Feature Extraction
• This step identifies important patterns or attributes from the
image.
• Techniques include:
   o Edge Detection: Finds boundaries of objects.
   o Corner Detection: Finds points where edges meet.
   o Texture Analysis: Recognizes smoothness, roughness, or
     repeated patterns.
   o Colour-based Features: Helps in distinguishing objects by
     colour.
• Example: A smartphone camera detecting a person’s face by locating
eyes, nose, and mouth edges.
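
Edge detection, the first technique listed above, can be illustrated with a small plain-Python sketch. It uses the Sobel kernel, which is one common choice and is not named in this worksheet; the function name `horizontal_gradient` is hypothetical, and border pixels are simply left at 0 to keep the example short.

```python
# Sobel kernel that responds to left-to-right changes in brightness.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def horizontal_gradient(image):
    """Apply SOBEL_X to every interior pixel; border pixels stay 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    total += SOBEL_X[dr + 1][dc + 1] * image[r + dr][c + dc]
            out[r][c] = abs(total)  # strength of the brightness change
    return out


# Left half dark (0), right half bright (255): one vertical edge.
image = [
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
]
edges = horizontal_gradient(image)
print(edges[1])  # [0, 0, 1020, 1020, 0, 0]
```

The large values appear only where dark pixels meet bright pixels, which is the boundary a feature extractor would report as an edge. Flat regions give 0.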

4. Detection/Segmentation
• Detects objects or regions of interest in the image.
• Four common tasks:
   1. Classification: Identifies what the object is (e.g., “This is a
      dog”).
   2. Classification + Localization: Identifies the object and its
      position using a bounding box.
   3. Object Detection: Finds and labels multiple objects in one
      image.
   4. Image Segmentation: Separates pixels into regions for better
      understanding.
      ▪ Semantic Segmentation: Groups all pixels of the same class
        together (e.g., all trees).
      ▪ Instance Segmentation: Differentiates between
        individual objects of the same class (e.g., three separate
        dogs).
• Example: In a self-driving car, segmentation helps identify lanes,
pedestrians, and traffic lights.
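
The ideas of segmentation and a bounding box can be shown with a very small plain-Python sketch. It uses simple brightness thresholding as a stand-in for segmentation; real systems use trained models. The function names `threshold_segment` and `bounding_box` and the threshold value 128 are assumptions made for this example.

```python
def threshold_segment(image, threshold):
    """Label each pixel 1 (object) if brighter than threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


def bounding_box(mask):
    """Return (top, left, bottom, right) around all object pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:  # no object pixels found
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# A dark background (10) with one bright 2x2 object in the middle.
image = [
    [10, 10, 10, 10, 10],
    [10, 200, 220, 10, 10],
    [10, 210, 230, 10, 10],
    [10, 10, 10, 10, 10],
]
mask = threshold_segment(image, 128)
box = bounding_box(mask)
print(box)  # (1, 1, 2, 2)
```

The mask is the segmentation result (which pixels belong to the object), and the box is the localization result (where the object is).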

5. High-Level Processing
• Final stage: interpreting results to make decisions.
• The system recognizes objects, understands context, and analyses
scenes.
• Example: In healthcare, after detecting a tumour in a scan, the
system suggests whether it is likely cancerous.

Q2. What are the applications of Computer Vision in real life?
Explain with examples.
Answer:
Computer Vision (CV) is one of the most important fields of Artificial
Intelligence. It helps machines not only to see images but also to analyze
and understand them, and to make decisions based on visual data. Today,
Computer Vision is applied in almost every sector. Some major applications
are:

1. Healthcare and Medical Imaging
• CV helps in detecting diseases early through medical scans like
X-rays, MRI, and CT scans.
• It can identify tumors, fractures, or infections that doctors may miss.
• Example: AI-based systems detect early signs of cancer from
mammogram images.

2. Self-Driving Cars
• Autonomous vehicles use computer vision to see and interpret the
road.
• CV detects pedestrians, other vehicles, road signs, and traffic lights.
• It helps the car with lane detection, obstacle avoidance, and safe
navigation.
• Example: Tesla’s Autopilot uses cameras and CV algorithms for
decision-making.

3. Surveillance and Security
• CV is widely used in CCTV cameras and monitoring systems.
• It can recognize faces, detect suspicious behavior, and even track
objects in real time.
• Example: Airports use face recognition systems for passenger
verification.

4. Retail and E-commerce
• CV helps in automated checkout systems, stock management, and
customer behavior analysis.
• Online shopping platforms use CV for visual search: a customer can
upload a photo to find similar products.
• Example: Amazon Go stores use CV to allow customers to shop without
cashiers.

5. Agriculture
• Farmers use CV to monitor crop health, detect pests, and analyze
soil conditions.
• Drones with cameras capture images of fields, and CV algorithms
detect problems.
• Example: CV detects whether the leaves of crops show disease symptoms.

6. Entertainment and Augmented Reality (AR)
• CV is used in video games, movies, and filters on social media
apps.
• AR applications overlay digital images on real-world objects using CV.
• Example: Snapchat and Instagram filters use facial landmark detection.

7. Industrial Automation
• In factories, CV helps in quality inspection and ensures products
meet standards.
• Robots use CV to identify, pick, and assemble objects on production
lines.
• Example: Detecting defective chips in an electronics factory.
