Worksheet Class 12 Ai

Uploaded by

vasundharaniranjan2

UNIT 1: PYTHON PROGRAMMING-II

1. Which of these is the conventional way to import NumPy?

A) import numpy
B) import numpy as np
C) import np as numpy
D) from numpy import *

2. What does NumPy primarily provide in Python?

A) Data visualization tools

B) Multi-dimensional arrays and high-speed operations
C) Natural Language Processing features
D) Web development framework

3. What is the main advantage of NumPy arrays over Python lists for
numerical operations?

A) They support string manipulation

B) They operate faster due to optimized memory and vectorized operations
C) They can hold multiple data types in the same array
D) They are immutable

4. What is the output shape of np.array([[1,2,3],[4,5,6]])?

A) (3,2)
B) (2,3)
C) (6,)
D) (2,2)

5. Which operation in NumPy applies element-wise?

A) Matrix inversion
B) Matrix multiplication only
C) Element-wise addition and multiplication using arrays
D) None of the above

6. Which method converts a Python list into a NumPy array?

A) np.list()
B) np.convert()
C) np.array()
D) np.make_array()

7. In NumPy, what is arr.dtype used for?

A) To get the number of dimensions
B) To know the data type of elements in the array
C) To reshape an array
D) To get array length

8. What does arr.shape return in NumPy?

A) The datatype
B) Size of the array
C) The dimensions of the array as a tuple
D) Total number of elements

9. What is the output of np.zeros((2,3))?

A) Array of ones with shape (2,3)

B) Array of zeros with shape (2,3)
C) Array of random numbers
D) Error

10. To get a flattened version of an array arr, which method is appropriate?

A) arr.flatten()
B) arr.reshape()
C) arr.flat()
D) arr.expand()

11–20: Pandas Foundations

11. What is a Series in Pandas?

A) A 2-dimensional labeled data structure

B) A 1-dimensional labeled array
C) A function in Pandas
D) An unordered collection

12. What is a DataFrame in Pandas?

A) A 1-dimensional labeled array

B) A 2-dimensional labeled data structure with columns
C) A visualization object
D) A type of Python dictionary

13. How do you create a Pandas Series from a list named lst?
A) pd.Series(lst)
B) pd.DataFrame(lst)
C) pd.Array(lst)
D) pd.Series.create(lst)

14. To import a CSV file into a DataFrame named df?

A) pd.load_csv('file.csv')
B) pd.read_csv('file.csv')
C) df.read_csv('file.csv')
D) np.read('file.csv')

15. To export DataFrame df to a CSV named out.csv?

A) df.to_csv('out.csv')
B) pd.to_csv(df, 'out.csv')
C) df.save('out.csv')
D) df.write_csv()

16. What does df.head() do?

A) Shows last 5 rows

B) Shows first 5 rows
C) Returns column names
D) Displays summary statistics

17. To view DataFrame summary stats like mean and std dev?

A) df.summary()
B) df.info()
C) df.describe()
D) df.stats()

18. To rename a column from 'old' to 'new' in df?

A) df.rename(columns={'old':'new'}, inplace=True)
B) df.change('old','new')
C) df.columns = 'new'
D) df.rename_column(old=new)

19. Which method helps read only specific columns from a CSV?

A) usecols= parameter in read_csv

B) df.select_cols()
C) df.columns(['col1','col2'])
D) None of the above

20. How do you set a specific column as the index while importing a CSV?

A) pd.read_csv('f.csv', index='col')
B) pd.read_csv('f.csv', index_col='col')
C) df.set_index('col')
D) Correct only: B is accurate

21–30: Handling Missing Data & DataFrame Operations

21. Which Pandas function identifies missing values?

A) df.missing()
B) df.isnull()
C) df.nan()
D) df.null()

22. To drop rows with missing values?

A) df.dropna()
B) df.remove_na()
C) df.na.drop()
D) df.clean()

23. To fill missing values with zero?

A) df.fillna(0)
B) df.na(0)
C) df.replace(None, 0)
D) df.zero()

24. How to fill missing values in column 'age' with mean?

A) df['age'].fillna(df['age'].mean(), inplace=True)
B) df.fillna('age', mean)
C) df['age'].mean_fill()
D) df.mean('age').fill()

25. How to drop a column named 'col'?

A) df.drop(columns=['col'])
B) df.remove('col')
C) df.del('col')
D) del df['col'] (also valid in pandas)

26. How do you add a new column 'new' calculated as 'col1' + 'col2'?

A) df['new'] = df['col1'] + df['col2']
B) df.new = df.col1 + df.col2
C) df.add_column('new', df['col1'] + df['col2'])
D) Only A is correct

27. To sort DataFrame df by a column 'score' descending?

A) df.sort('score', ascending=False)
B) df.sort_values(by='score', ascending=False)
C) df.order_by('score', desc=True)
D) df.sort_desc('score')

28. How to reset an index of DataFrame?

A) df.reset_index(drop=True)
B) df.reset()
C) df.set_index(None)
D) df.reindex()

29. Which method returns data types of each column?

A) df.info()
B) df.types()
C) df.dtypes
D) Both A and C work (df.dtypes shows column types; info() includes types)

30. Conversion of a column 'col' to integer type?

A) df['col'].astype(int)
B) df['col'].convert(int)
C) df.astype('col', int)
D) df.typecast('col', int)

31–40: Linear Regression (Advanced Learners)

31. What is the simplest form of regression in ML?

A) Logistic regression
B) Linear regression
C) Polynomial regression
D) None of these

32. In linear regression with one feature, the equation y = ax + b denotes:

A) y is feature, x is target
B) x is predictor, y is response; a = slope, b = intercept
C) a is intercept, b is slope
D) None of these

33. Which Python library can implement linear regression?

A) NumPy
B) Pandas
C) Scikit-learn
D) Matplotlib

34. How to split data into training and testing sets?

A) Manual slicing
B) train_test_split from Scikit-learn
C) Pandas df.split()
D) NumPy np.split()

35. In Scikit-learn, which class is used for linear regression?

A) LinearModel
B) LinearRegression
C) Regression
D) LinearFit

36. After training a linear regression model named model, how do you make predictions on X_test?

A) model.predict(X_test)
B) model.predict()
C) model.predict(X_train)
D) predict(model, X_test)

37. Which metric calculates the average squared error between actual and predicted values?

A) Mean Squared Error (MSE)

B) Root Mean Squared Error (RMSE)
C) Both A and B
D) Mean Absolute Error (MAE)

38. Which gives error in same units as target variable?

A) MSE
B) RMSE
C) Both
D) None

39. Which library provides functions for MSE and RMSE?

A) NumPy
B) Pandas
C) Scikit-learn (mean_squared_error)
D) Matplotlib

40. Why use a train-test split in regression?

A) To test model on unseen data and judge performance

B) To increase accuracy artificially
C) To reduce dataset size
D) It's optional and not helpful

ANSWER KEY:

1–10: NumPy Basics and Arrays

1. B) import numpy as np
2. B) Multi-dimensional arrays and high-speed operations
3. B) They operate faster due to optimized memory and vectorized
operations
4. B) (2,3)
5. C) Element-wise addition and multiplication using arrays
6. C) np.array()
7. B) To know the data type of elements in the array
8. C) The dimensions of the array as a tuple
9. B) Array of zeros with shape (2,3)
10. A) arr.flatten()
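
The NumPy answers above can be checked with a short sketch. The array values are the ones from Question 4; the variable names are only illustrative.

```python
import numpy as np

# A list of lists becomes a 2-D array (2 rows, 3 columns).
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape)   # dimensions of the array as a tuple
print(arr.dtype)   # data type of the elements (an integer type here)

zeros = np.zeros((2, 3))   # array of zeros with shape (2, 3)
flat = arr.flatten()       # 1-D copy of the array
doubled = arr * 2          # element-wise multiplication
summed = arr + arr         # element-wise addition

print(zeros)
print(flat)
print(doubled)
```

Because the arithmetic is applied to every element at once, no explicit Python loop is needed; this is the vectorized behaviour referred to in Question 3.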

11–20: Pandas Foundations

11. B) A 1-dimensional labeled array

12. B) A 2-dimensional labeled data structure with columns

13. A) pd.Series(lst)
14. B) pd.read_csv('file.csv')

15. A) df.to_csv('out.csv')

16. B) Shows first 5 rows

17. C) df.describe()

18. A) df.rename(columns={'old':'new'}, inplace=True)

19. A) usecols= parameter in read_csv

20. B) pd.read_csv('f.csv', index_col='col')
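
A minimal sketch of the CSV answers above (Questions 14 to 20). To keep it runnable without a file on disk, it assumes the CSV text is held in a string and read through io.StringIO; the column names and values are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file such as 'file.csv'.
csv_text = (
    "RollNo,Name,Marks,Age\n"
    "101,Amit,87,16\n"
    "102,Riya,92,17\n"
    "103,Kiran,78,16\n"
)

# usecols reads only the listed columns; index_col sets the index during import.
df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["RollNo", "Name", "Marks"],
    index_col="RollNo",
)

# Rename a column, then look at the first rows and the summary statistics.
df = df.rename(columns={"Marks": "Score"})
print(df.head())
print(df.describe())

# With no path given, to_csv returns the CSV text instead of writing a file.
out = df.to_csv()
print(out)
```

With a real file, the first argument of read_csv would simply be the file path, as in the answer to Question 14.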

21–30: Handling Missing Data & DataFrame Operations

21. B) df.isnull()

22. A) df.dropna()

23. A) df.fillna(0)

24. A) df['age'].fillna(df['age'].mean(), inplace=True)

25. A) df.drop(columns=['col']) (D also works but A is standard)

26. A) df['new'] = df['col1'] + df['col2']

27. B) df.sort_values(by='score', ascending=False)

28. A) df.reset_index(drop=True)

29. D) Both A and C work (df.dtypes shows column types; info() includes types)

30. A) df['col'].astype(int)
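
The missing-data answers above can be tried on a small example. This is a sketch with made-up data. For the mean fill it assigns the result back to the column rather than using inplace=True on a selected column, since recent pandas versions may warn about that pattern; this is a swapped-in variant of the answer to Question 24, not a different technique.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing ages.
df = pd.DataFrame({
    "name": ["Amit", "Riya", "Kiran", "Neha"],
    "age": [16.0, np.nan, 18.0, np.nan],
    "score": [87, 92, 78, 85],
})

missing_per_column = df.isnull().sum()   # count of missing values per column
dropped = df.dropna()                    # rows with any missing value removed
zero_filled = df.fillna(0)               # missing values replaced by 0

# Fill missing ages with the column mean, then convert the column to integers.
df["age"] = df["age"].fillna(df["age"].mean())
df["age"] = df["age"].astype(int)

# Sort by score (highest first) and rebuild a clean 0..n-1 index.
df = df.sort_values(by="score", ascending=False).reset_index(drop=True)

print(missing_per_column)
print(df)
print(df.dtypes)
```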

31–40: Linear Regression (Advanced Learners)

31. B) Linear regression

32. B) x is predictor, y is response; a = slope, b = intercept

33. C) Scikit-learn

34. B) train_test_split from Scikit-learn

35. B) LinearRegression
36. A) model.predict(X_test)

37. A) Mean Squared Error (MSE)

38. B) RMSE

39. C) Scikit-learn (mean_squared_error)

40. A) To test model on unseen data and judge performance
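
The regression ideas above (y = ax + b, train-test split, MSE and RMSE) can be sketched as follows. To stay dependency-light, this sketch uses NumPy's np.polyfit and a manual slice instead of Scikit-learn's LinearRegression and train_test_split; the data is hypothetical.

```python
import numpy as np

# Hypothetical data: hours studied (x) and marks (y), roughly y = 5x + 40.
x = np.arange(1, 11, dtype=float)
y = np.array([46, 49, 56, 59, 66, 69, 76, 79, 86, 89], dtype=float)

# Manual split: first 8 points for training, last 2 kept as unseen test data.
x_train, x_test = x[:8], x[8:]
y_train, y_test = y[:8], y[8:]

# Fit a straight line; polyfit with deg=1 returns slope a and intercept b.
a, b = np.polyfit(x_train, y_train, deg=1)

# Predict on the test inputs using y = a*x + b.
y_pred = a * x_test + b

# MSE is the average squared error; RMSE is its square root,
# so RMSE is in the same units as the target (marks).
mse = np.mean((y_test - y_pred) ** 2)
rmse = np.sqrt(mse)

print("slope:", a, "intercept:", b)
print("MSE:", mse, "RMSE:", rmse)
```

With Scikit-learn, the same steps correspond to train_test_split, LinearRegression().fit(...), model.predict(X_test) and mean_squared_error, as listed in the answer key.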

DESCRIPTIVE QUESTIONS AND ANSWERS:

1. Create DataFrame from List of Lists

import pandas as pd

# Each inner list is one row: RollNo, Name, Marks
data = [
    [101, "Amit", 87],
    [102, "Riya", 92],
    [103, "Kiran", 78],
]

df = pd.DataFrame(data, columns=["RollNo", "Name", "Marks"])

print("DataFrame created from list of lists:\n", df)

2. Create DataFrame from Dictionary

import pandas as pd

data = {

"RollNo": [101, 102, 103],

"Name": ["Amit", "Riya", "Kiran"],

"Marks": [87, 92, 78]

}

df = pd.DataFrame(data)

print("DataFrame created from dictionary:\n", df)

3. Access Rows and Columns

import pandas as pd

df = pd.DataFrame({

"RollNo": [101, 102, 103],

"Name": ["Amit", "Riya", "Kiran"],

"Marks": [87, 92, 78]

})

# Access column

print("Marks column:\n", df["Marks"])

# Access row by index

print("Second row:\n", df.iloc[1])

4. Boolean Indexing

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran", "Neha"],

"Marks": [87, 92, 78, 85]

})
# Students with Marks > 80

high_marks = df[df["Marks"] > 80]

print("Students with Marks > 80:\n", high_marks)

5. Filter Multiple Conditions

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran", "Neha"],

"Marks": [87, 92, 78, 85],

"Age": [16, 17, 16, 17]

})

# Students with Marks>80 AND Age=17

filtered = df[(df["Marks"] > 80) & (df["Age"] == 17)]

print("Filtered DataFrame:\n", filtered)

6. Add New Column Based on Calculation

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran"],

"Marks1": [87, 92, 78],

"Marks2": [90, 88, 80]

})
# Add total marks column

df["Total"] = df["Marks1"] + df["Marks2"]

print("DataFrame with Total Marks:\n", df)

7. Import CSV and Display

import pandas as pd

# Assuming 'student.csv' has columns RollNo, Name, Marks

df = pd.read_csv("student.csv")

print("CSV DataFrame:\n", df)

8. Export DataFrame to CSV

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran"],

"Marks": [87, 92, 78]

})

df.to_csv("output.csv", index=False)

print("DataFrame exported to 'output.csv'")

9. Access Specific Rows and Columns

import pandas as pd

df = pd.DataFrame({

"RollNo": [101, 102, 103, 104],

"Name": ["Amit", "Riya", "Kiran", "Neha"],

"Marks": [87, 92, 78, 85]

})

# Access first two rows

print("First two rows:\n", df.head(2))

# Access specific columns

print("Name and Marks columns:\n", df[["Name", "Marks"]])

10. Create DataFrame from List of Dictionaries

import pandas as pd

# Each dictionary is one row; the keys become column names
data = [
    {"RollNo": 101, "Name": "Amit", "Marks": 87},
    {"RollNo": 102, "Name": "Riya", "Marks": 92},
    {"RollNo": 103, "Name": "Kiran", "Marks": 78},
]

df = pd.DataFrame(data)

print("DataFrame from list of dictionaries:\n", df)

11. Append/Concatenate DataFrames

import pandas as pd

df1 = pd.DataFrame({

"Name": ["Amit", "Riya"],

"Marks": [87, 92]

})

df2 = pd.DataFrame({

"Name": ["Kiran", "Neha"],

"Marks": [78, 85]

})

# Append df2 to df1 (DataFrame.append was removed in pandas 2.0; use pd.concat)

df_appended = pd.concat([df1, df2], ignore_index=True)

print("Appended DataFrame:\n", df_appended)

12. Series Arithmetic Operations

import pandas as pd

s1 = pd.Series([10, 20, 30])

s2 = pd.Series([1, 2, 3])

print("Addition:", s1 + s2)

print("Subtraction:", s1 - s2)

print("Multiplication:", s1 * s2)

print("Division:", s1 / s2)

13. Access Rows Using iloc and loc

import pandas as pd

df = pd.DataFrame({

"RollNo": [101, 102, 103],

"Name": ["Amit", "Riya", "Kiran"],

"Marks": [87, 92, 78]

})

# Access row by position

print("First row using iloc:\n", df.iloc[0])

# Access row by index label (if set index)

df.set_index("RollNo", inplace=True)

print("Row with RollNo 102 using loc:\n", df.loc[102])

14. Boolean Indexing with Multiple Conditions

import pandas as pd

df = pd.DataFrame({

"Name": ["Amit", "Riya", "Kiran", "Neha"],

"Marks": [87, 92, 78, 85],

"Age": [16, 17, 16, 17]

})

# Marks > 80 AND Age=17

result = df[(df["Marks"] > 80) & (df["Age"] == 17)]
print("Filtered DataFrame:\n", result)

15. Adding a Calculated Column

import pandas as pd

df = pd.DataFrame({
"Name": ["Amit", "Riya", "Kiran"],
"Marks1": [87, 92, 78],
"Marks2": [90, 88, 80]
})

# Average marks column

df["Average"] = (df["Marks1"] + df["Marks2"]) / 2
print("DataFrame with Average:\n", df)

16. Access Specific Rows & Columns

import pandas as pd

df = pd.DataFrame({
"RollNo": [101, 102, 103, 104],
"Name": ["Amit", "Riya", "Kiran", "Neha"],
"Marks": [87, 92, 78, 85]
})

# Access first 2 rows

print("First 2 rows:\n", df.head(2))

# Access Name and Marks columns

print("Name and Marks:\n", df[["Name", "Marks"]])

UNIT 2: DATA SCIENCE METHODOLOGY: AN ANALYTIC APPROACH TO CAPSTONE PROJECT

1. The first step in a Data Science project is:

A) Data Cleaning
B) Modeling
C) Problem Definition
D) Evaluation

2. A Data Science project is also called a:

A) Capstone Project
B) Programming task
C) Spreadsheet task
D) Web project

3. Which step comes after Problem Definition?

A) Analytic Approach
B) Deployment
C) Data Cleaning
D) Feedback

4. Data Science Methodology helps us to:

A) Play games
B) Follow a clear process to solve problems
C) Learn drawing
D) Make speeches

5. In the Analytic Approach stage, we decide:

A) Which game to play
B) Which method or model to use
C) Which teacher to ask
D) Which exam to write

6. Predicting marks of students is an example of:

A) Regression
B) Classification
C) Clustering
D) Random guessing

7. Identifying whether a mail is spam or not is an example of:

A) Regression
B) Classification
C) Clustering
D) Data cleaning

8. Grouping customers based on purchase habits is:

A) Classification
B) Regression
C) Clustering
D) Evaluation

9. A well-defined problem should be:

A) Clear and measurable
B) Vague
C) Random
D) Confusing

10. Success criteria means:

A) How we will measure success of our project
B) How much data we have
C) What coding language we use
D) What chart we draw

11–20: Data Requirements, Collection & Understanding

11. Data requirement depends on:

A) Problem statement
B) Favorite subject
C) Mobile app used
D) Random choice

12. Tables in databases are examples of:

A) Structured data
B) Unstructured data
C) Videos
D) Images

13. Social media posts are:

A) Structured data
B) Unstructured data
C) Tabular data
D) Numeric data

14. Collecting data by survey is called:

A) Primary data collection
B) Secondary data collection
C) Tertiary data collection
D) Optional

15. Using already published government census is:

A) Primary data
B) Secondary data
C) Experimental data
D) Random data

16. Checking whether data is complete and correct is:

A) Data Understanding
B) Data Cleaning
C) Deployment
D) Coding
17. Missing values, duplicates, and outliers are:
A) Data quality issues
B) Data visualizations
C) Problem statements
D) Success criteria

18. Calculating mean, median, and mode comes under:

A) Data Understanding
B) Deployment
C) Problem Definition
D) Feedback

19. Data visualization helps to:

A) Understand data better
B) Waste time
C) Confuse students
D) Avoid coding

20. Pie charts and bar graphs are used in:

A) Data Visualization
B) Model Training
C) Problem Definition
D) Deployment

21–30: Data Preparation & Modeling

21. Removing duplicates is part of:

A) Data Preparation
B) Problem Definition
C) Deployment
D) Feedback

22. Filling missing values is part of:

A) Data Preparation
B) Deployment
C) Problem Definition
D) Evaluation

23. Converting text labels into numbers is called:

A) Encoding
B) Cleaning
C) Plotting
D) Deployment

24. Splitting data into training and testing is done in:

A) Data Preparation
B) Deployment
C) Problem Definition
D) Feedback

25. The variable we want to predict is called:

A) Target
B) Feature
C) Input
D) Record

26. The input variables used to predict are called:

A) Features
B) Targets
C) Labels
D) Models

27. In supervised learning, we have:

A) Input and output labels
B) Only inputs
C) Only outputs
D) Random data

28. In clustering, data is grouped:

A) Without labels
B) With labels
C) With answers given
D) With teacher’s notes

29. Predicting rainfall in cm is an example of:

A) Regression
B) Classification
C) Clustering
D) Deployment

30. Predicting whether a student passes or fails is an example of:

A) Classification
B) Regression
C) Clustering
D) Data Cleaning

31–40: Evaluation, Deployment & Feedback

31. Accuracy is used for:

A) Classification models
B) Regression models
C) Clustering models
D) Data visualization

32. Mean Squared Error is used for:

A) Regression models
B) Classification models
C) Clustering models
D) None

33. Precision and Recall are for:

A) Classification problems
B) Regression problems
C) Data cleaning
D) Deployment
34. Cross-validation is used for:
A) Checking model performance
B) Collecting data
C) Drawing graphs
D) Defining problem
35. ROC curve is used in:
A) Classification models
B) Regression models
C) Clustering models
D) Data preparation
36. In deployment stage, the model is:
A) Used in real world
B) Deleted
C) Tested again
D) Ignored
37. Feedback stage helps in:
A) Improving the model
B) Stopping the project
C) Avoiding evaluation
D) Deleting data
38. Data Science project is considered complete when:
A) Model is deployed and working well
B) Data is collected
C) Problem is written
D) Graph is drawn
39. The last step in the Data Science Methodology is:
A) Feedback and Iteration
B) Modeling
C) Data Collection
D) Evaluation
40. A Capstone Project means:
A) A complete project applying all steps of Data Science Methodology
B) A single coding exercise
C) A survey form
D) A group discussion

ANSWER KEY:

1–10: Problem Definition & Analytic Approach

1. C) Problem Definition

2. A) Capstone Project

3. A) Analytic Approach

4. B) Follow a clear process to solve problems

5. B) Which method or model to use

6. A) Regression

7. B) Classification

8. C) Clustering

9. A) Clear and measurable

10. A) How we will measure success of our project

11–20: Data Requirements, Collection & Understanding

11. A) Problem statement

12. A) Structured data

13. B) Unstructured data

14. A) Primary data collection

15. B) Secondary data

16. A) Data Understanding

17. A) Data quality issues

18. A) Data Understanding

19. A) Understand data better

20. A) Data Visualization
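
The Data Understanding checks named above (missing values, mean, median, mode) can be sketched with Python's built-in statistics module. The marks list is hypothetical, and None stands in for a missing value.

```python
import statistics

# Hypothetical raw marks; None marks a missing entry.
raw_marks = [78, 85, None, 85, 92, 60, 85, 78, None]

# Data quality check: how many values are missing?
missing_count = sum(value is None for value in raw_marks)

# Keep only the values that are present before computing statistics.
marks = [value for value in raw_marks if value is not None]

mean_marks = statistics.mean(marks)       # average
median_marks = statistics.median(marks)   # middle value when sorted
mode_marks = statistics.mode(marks)       # most frequent value

print("Missing:", missing_count)
print("Mean:", mean_marks, "Median:", median_marks, "Mode:", mode_marks)
```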

21–30: Data Preparation & Modeling

21. A) Data Preparation

22. A) Data Preparation

23. A) Encoding

24. A) Data Preparation

25. A) Target

26. A) Features

27. A) Input and output labels

28. A) Without labels

29. A) Regression

30. A) Classification

31–40: Evaluation, Deployment & Feedback

31. A) Classification models

32. A) Regression models

33. A) Classification problems

34. A) Checking model performance

35. A) Classification models

36. A) Used in real world

37. A) Improving the model

38. A) Model is deployed and working well

39. A) Feedback and Iteration

40. A) A complete project applying all steps of Data Science Methodology

DESCRIPTIVE QUESTIONS:

1. What are the main steps in the Data Science Methodology? Explain briefly.

Answer: Main Steps of the Data Science Methodology

Big picture: It’s a repeatable, iterative cycle that moves from a real-world
problem to a working solution, and then improves it with feedback.
1) Business Understanding (Problem Scoping)

Key question: What problem are we solving and why?

What we do: Talk to stakeholders, use 5W1H and Design Thinking, define
goals, scope, success criteria, constraints.
Deliverable: Clear problem statement + success measure.
Mini example: A school wants to identify students who may need extra
help before finals (goal: reduce failures by 10%).

2) Analytic Approach

Key question: How can data answer this problem?

What we do: Map the problem to an analysis type—classification,
regression, clustering, anomaly detection, recommendation; choose
descriptive/diagnostic/predictive/prescriptive analytics.
Deliverable: Chosen approach (e.g., classification).
Mini example: “Pass/Fail” is classification.

3) Data Requirements

Key question: What data do we need?

What we do: Decide data types (numbers/text/images), format
(tables/CSV), granularity, time range, quality, ethics/privacy.
Deliverable: Data requirement/specification document.
Mini example: Needed features: attendance %, internal test scores,
homework completion, study hours.

4) Data Collection

Key question: Where will we get it from?

What we do: Gather data from primary (surveys, forms, sensors) and
secondary (databases, open data) sources.
Deliverable: Raw dataset + data dictionary.
Mini example: Pull records from the school MIS; add a short student survey
for study hours.

5) Data Understanding (Exploration)

Key question: Does the data represent the problem correctly?

What we do: Explore with statistics and visuals; check distributions,
correlations, missing values, outliers, data leakage.
Deliverable: EDA (Exploratory Data Analysis) summary with findings/issues.
Mini example: Notice many missing study hours for Grade 12;
attendance strongly correlates with outcomes.

6) Data Preparation (Cleaning & Feature Engineering)

Key question: How do we make data model-ready?

What we do:

• Clean: fix typos, handle missing values, remove duplicates.

• Transform: encode categories, scale numbers, derive new features.

• Integrate: merge tables; split into train/validation/test.

Deliverable: Tidy, modelling-ready dataset. (Most time-consuming
step!)
Mini example: Create “term average”, “consistency score”;
impute missing study hours; one-hot encode streams (Sci/Comm/Arts).
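
The Data Preparation step can be sketched in pandas. The table below is hypothetical, and median imputation and pd.get_dummies are only one possible choice for handling missing values and one-hot encoding.

```python
import numpy as np
import pandas as pd

# Hypothetical student records: one duplicated row and one missing value.
students = pd.DataFrame({
    "student_id": [1, 2, 2, 3, 4],
    "stream": ["Sci", "Comm", "Comm", "Arts", "Sci"],
    "test1": [80, 65, 65, 90, 70],
    "test2": [84, 75, 75, 70, 74],
    "study_hours": [2.0, 1.5, 1.5, np.nan, 1.0],
})

# Clean: remove the duplicated record.
students = students.drop_duplicates().reset_index(drop=True)

# Clean: impute the missing study hours with the median of the known values.
students["study_hours"] = students["study_hours"].fillna(
    students["study_hours"].median()
)

# Transform: derive a new feature, the average of the two tests.
students["term_average"] = students[["test1", "test2"]].mean(axis=1)

# Transform: one-hot encode the stream column (Sci/Comm/Arts).
prepared = pd.get_dummies(students, columns=["stream"])

print(prepared)
```

The result has one column per stream, so a model receives numbers rather than text labels.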

7) Modelling

Key question: Which algorithm fits best?

What we do: Train candidate models (e.g., Decision Tree, Logistic
Regression), tune hyperparameters, use cross-validation.
Deliverable: One or more candidate models with training results.
Mini example: Train a Decision Tree to classify Pass/Fail.

8) Evaluation

Key question: Does the model answer the original question well?
What we do: Test on unseen data; use metrics (Accuracy, Precision, Recall,
F1 for classification; MAE/MSE/RMSE for regression). Check against business
success criteria (from Step 1).
Deliverable: Evaluation report + decision (proceed/adjust).
Mini example: Model gets F1 = 0.86; meets target (≥0.80). If not, revisit
features or approach.
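
The classification metrics named in the Evaluation step can be computed by hand. This is a sketch with hypothetical labels (1 = Pass, 0 = Fail) and an assumed F1 target of 0.80, taken from the mini example.

```python
# Hypothetical actual and predicted labels for 10 students (1 = Pass, 0 = Fail).
actual    = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)      # share of all predictions that are right
precision = tp / (tp + fp)             # of predicted Pass, how many really passed
recall = tp / (tp + fn)                # of real Pass, how many were found
f1 = 2 * precision * recall / (precision + recall)

# Compare against the success criterion set in Step 1.
meets_target = f1 >= 0.80

print("Accuracy:", accuracy)
print("Precision:", precision, "Recall:", recall, "F1:", f1)
print("Meets target:", meets_target)
```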

9) Deployment

Key question: How does the user get the solution?

What we do: Put the model into use—web app, mobile app, dashboard, API;
decide batch vs real-time; add basic monitoring.
Deliverable: Working solution in the real environment.
Mini example: A simple teacher dashboard flags students needing
support each week.

10) Feedback (Monitoring & Improvement)

Key question: Is the problem solved? What can we improve?

What we do: Gather user feedback, track performance over time, detect
drift, retrain periodically, refine features and rules.
Deliverable: Updated model/process; continuous improvement loop.
Mini example: Teachers report some false alarms after vacations → adjust
the model to consider holiday weeks.

2. Explain the importance of Data Understanding and Data Preparation in the Data Science Methodology with examples.

Answer:

After collecting data, it cannot be directly used for analysis because raw data
is often incomplete, inconsistent, or contains errors. Two important steps at
this stage are Data Understanding and Data Preparation.

1. Data Understanding

o In this step, the data collected is carefully studied.

o The aim is to know what kind of data is available, its format, size,
and quality.
o Example: If a school collects student marks data, we check how
many subjects are there, whether marks are stored as numbers,
and if any values are missing.
o This step helps in deciding whether the data is sufficient for
solving the problem.
2. Data Preparation
o Also called data cleaning or data preprocessing.
o It involves correcting errors, filling missing values, removing
duplicates, and converting data into a useful format.
o Example: In the student marks data, if some records have
“Absent” instead of marks, it must be handled properly (e.g.,
replaced with 0 or marked as missing).
o Without preparation, the model may give wrong or misleading
results.
3. Why these steps are important
o Good quality data leads to reliable results.
o Poor or unclean data leads to wrong conclusions, which can
affect decision-making.
o These steps ensure that the data is accurate, consistent, and
suitable for building models.
👉 Thus, Data Understanding and Preparation are the backbone of the
Data Science process, because “garbage in, garbage out” – if we feed
wrong data, we will get wrong results.
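
The "Absent" example above can be sketched in pandas. The names and marks are hypothetical; pd.to_numeric with errors="coerce" turns any non-numeric entry into a missing value (NaN).

```python
import pandas as pd

# Hypothetical marks column where one entry is the text "Absent".
marks = pd.DataFrame({
    "Name": ["Amit", "Riya", "Kiran", "Neha"],
    "Maths": ["87", "Absent", "78", "92"],
})

# Data Understanding: the column is stored as text, not numbers.
print(marks["Maths"].dtype)

# Data Preparation: convert to numbers; "Absent" becomes NaN (missing).
marks["Maths"] = pd.to_numeric(marks["Maths"], errors="coerce")

missing_count = marks["Maths"].isnull().sum()
mean_present = marks["Maths"].mean()   # NaN values are skipped by default

print(marks)
print("Missing:", missing_count, "Mean of students present:", mean_present)
```

Marking the entry as missing keeps the class average based only on students who sat the test; replacing it with 0 would pull the average down, so the choice depends on what the analysis is meant to measure.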

UNIT 3: MAKING MACHINES SEE

1–10: Basics of Computer Vision

1. What does "Computer Vision" mean?

A) Teaching machines to see and understand images
B) Teaching machines to write code
C) Teaching machines to listen to music
D) Teaching machines to play games

2. Which of the following is an example of Computer Vision?

A) Face recognition in mobile phones
B) Listening to songs
C) Reading novels
D) Sending emails

3. Which sense of humans does Computer Vision try to imitate?

A) Hearing
B) Vision
C) Touch
D) Smell

4. Which technology is used in self-driving cars to detect obstacles?

A) Computer Vision
B) Natural Language Processing
C) Cloud Computing
D) Virtual Reality

5. OCR stands for:

A) Optical Character Recognition
B) Object Character Recognition
C) Optical Camera Reading
D) Open Character Reading

6. Which device is commonly used to capture images for computer vision?

A) Camera
B) Microphone
C) Keyboard
D) Speaker

7. In computer vision, an image is represented as:

A) Rows and columns of pixels
B) Blocks of sound waves
C) Pages of text
D) Series of commands

8. Each pixel in a grayscale image contains:

A) One intensity value
B) Three values (RGB)
C) Only 0 or 1
D) A text label

9. What does RGB stand for in images?

A) Red, Green, Blue
B) Read, Go, Black
C) Random, General, Binary
D) Range, Grid, Brightness

10. Which type of image uses only two colors: black and white?
A) Binary Image
B) Grayscale Image
C) RGB Image
D) Color Image

11–20: Image Types and Features

11. Which type of image has values between 0 to 255 for intensity?
A) Grayscale
B) Binary
C) RGB
D) None
12. A color image (RGB) has how many channels?
A) 3
B) 2
C) 1
D) 4

13. Which of the following is an application of Computer Vision?

A) Detecting diseases in X-ray images
B) Listening to music
C) Writing stories
D) Sending SMS

14. Facial recognition system in airports is an example of:

A) Computer Vision
B) Speech Recognition
C) Data Science
D) Robotics only

15. Number plate detection in traffic cameras is called:

A) Automatic Number Plate Recognition (ANPR)
B) Automatic Name Plate Reading
C) Auto Numeric Processing
D) Artificial Numeric Prediction

16. Which of these is NOT a step in computer vision?

A) Capturing image
B) Processing image
C) Extracting features
D) Listening to songs

17. Edge detection is used to:

A) Find boundaries in images
B) Increase sound quality
C) Write texts faster
D) Store data

18. Which tool/library is widely used in Python for Computer Vision?

A) OpenCV
B) NumPy only
C) MS Word
D) PowerPoint
19. In image processing, filtering is used to:
A) Remove noise
B) Write documents
C) Send emails
D) Play music

20. Convolution is a process mainly used in:

A) Image processing and deep learning
B) Emailing
C) Gaming only
D) Text editing

21–30: Deep Learning & Vision

21. CNN stands for:

A) Convolutional Neural Network
B) Central Neural Node
C) Computer Numeric Network
D) Common Neural Net

22. CNN is widely used for:

A) Image recognition
B) Music composition
C) Text writing
D) Weather reporting

23. In CNN, the convolution layer is used to:

A) Detect features in an image
B) Play games
C) Send messages
D) Translate languages

24. Pooling layer in CNN is used for:

A) Reducing size of feature maps
B) Sending emails
C) Increasing volume
D) Writing code

25. Which of these is NOT an application of Computer Vision?

A) Object Detection
B) Speech-to-Text conversion
C) Face Detection
D) Medical Imaging

26. Object detection means:

A) Identifying and locating objects in an image
B) Counting numbers
C) Detecting sound
D) Writing notes

27. Image classification means:

A) Assigning a label to an image
B) Assigning sound to an image
C) Breaking the image into small pieces
D) Removing pixels

28. Which of these is an everyday use of Computer Vision?

A) Google Lens
B) MS Paint
C) Notepad
D) Excel only

29. Which industry uses computer vision for quality control?

A) Manufacturing
B) Music
C) Literature
D) Tourism

30. In healthcare, computer vision can help in:

A) Detecting tumors in scans
B) Cooking food
C) Playing sports
D) Teaching languages

31–40: Advanced Applications & Ethics

31. Which company uses Computer Vision in self-driving cars?

A) Tesla
B) WhatsApp
C) Spotify
D) Twitter
32. Which social media app uses computer vision for automatic
photo tagging?
A) Facebook
B) WhatsApp
C) Telegram
D) Signal
33. Gesture recognition allows computers to:
A) Understand hand and body movements
B) Understand smell
C) Understand music
D) Understand novels
34. Which of these is a challenge in computer vision?
A) Poor image quality
B) Bright lighting always
C) Correct grammar
D) High volume
35. Image-based CAPTCHA tests on websites are built around tasks from:
A) Computer Vision & Pattern Recognition
B) Music recognition
C) Text summarization
D) Story writing
36. Which of these is NOT a concern in Computer Vision?
A) Privacy issues
B) Data security
C) Image accuracy
D) Cooking recipes
37. Which application uses Computer Vision in sports?
A) Tracking ball movement in cricket/football
B) Reading books aloud
C) Generating poems
D) Playing songs
38. Augmented Reality (AR) uses Computer Vision for:
A) Overlaying digital images on real world
B) Sending text messages
C) Cooking recipes
D) Playing audio
39. Which of the following is an example of biometric system using
Computer Vision?
A) Face unlock in smartphones
B) Fingerprint sensor
C) Voice recognition
D) Password typing
40. The main goal of Computer Vision is:
A) To enable machines to see, process, and understand images like
humans
B) To write emails faster
C) To play games
D) To talk like humans

Answer Key
1–10: Basics of Computer Vision
1. A) Teaching machines to see and understand images
2. A) Face recognition in mobile phones
3. B) Vision
4. A) Computer Vision
5. A) Optical Character Recognition
6. A) Camera
7. A) Rows and columns of pixels
8. A) One intensity value
9. A) Red, Green, Blue
10. A) Binary Image

11–20: Image Types and Features

11. A) Grayscale
12. A) 3
13. A) Detecting diseases in X-ray images
14. A) Computer Vision
15. A) Automatic Number Plate Recognition (ANPR)
16. D) Listening to songs
17. A) Find boundaries in images
18. A) OpenCV
19. A) Remove noise
20. A) Image processing and deep learning
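
The image types in the answers above can be pictured as NumPy arrays. This is a sketch with tiny made-up images; the threshold of 127 used to build the binary image is an assumption.

```python
import numpy as np

# Grayscale image: rows and columns of pixels, one intensity (0 to 255) each.
gray = np.array([[0, 64, 128],
                 [192, 255, 32]], dtype=np.uint8)

# RGB image: same rows and columns, but 3 channels (Red, Green, Blue) per pixel.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[..., 0] = 255   # set the red channel to maximum: a pure red image

# Binary image: only two values, made here by thresholding the grayscale image.
binary = (gray > 127).astype(np.uint8)

print(gray.shape)   # rows, columns
print(rgb.shape)    # rows, columns, channels
print(binary)
```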

21–30: Deep Learning & Vision

21. A) Convolutional Neural Network
22. A) Image recognition
23. A) Detect features in an image
24. A) Reducing size of feature maps
25. B) Speech-to-Text conversion
26. A) Identifying and locating objects in an image
27. A) Assigning a label to an image
28. A) Google Lens
29. A) Manufacturing
30. A) Detecting tumors in scans
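
The convolution and pooling layers in the answers above can be sketched in plain NumPy. The image and the kernel are hypothetical. The sliding-window function below does not flip the kernel, which is the operation CNN libraries usually call convolution.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the largest value in each 2x2 block, halving height and width."""
    h = feature_map.shape[0] // 2 * 2
    w = feature_map.shape[1] // 2 * 2
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# 5x5 image: dark on the left, bright on the right (a vertical edge).
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hypothetical 2x2 kernel that responds to a dark-to-bright change.
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)

feature_map = convolve2d_valid(image, kernel)   # detects the feature (the edge)
pooled = max_pool_2x2(feature_map)              # reduces the feature map size

print(feature_map)
print(pooled)
```

The feature map is large only where the edge is, and pooling keeps that response while shrinking the map from 4x4 to 2x2.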

31–40: Advanced Applications & Ethics

31. A) Tesla
32. A) Facebook
33. A) Understand hand and body movements
34. A) Poor image quality
35. A) Computer Vision & Pattern Recognition
36. D) Cooking recipes
37. A) Tracking ball movement in cricket/football
38. A) Overlaying digital images on the real world
39. A) Face unlock in smartphones
40. A) To enable machines to see, process, and understand images
like humans

II. DESCRIPTIVE ANSWERS:
Q1. Explain the five stages of the Computer Vision process with
examples.
Answer:
The Computer Vision (CV) process allows machines to “see” and interpret
visual data such as images and videos. It generally follows five main
stages:

1. Image Acquisition
• First step: capturing or obtaining a digital image or video.
• Sources: digital cameras, scanners, medical imaging devices (such as
MRI and CT scanners), or design software.
• Quality depends on camera resolution, lighting, and angle.
• Example: A CCTV camera captures images of people entering a
building.
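
Once acquired, a digital image is simply a grid of pixel values. The short sketch below shows how a grayscale image and an RGB image might be held as NumPy arrays; the 4x6 size and the pixel values are made up for illustration and do not come from a real camera.

```python
import numpy as np

# Hypothetical 4x6 "captures" (synthetic data, not from a real camera).
gray = np.zeros((4, 6), dtype=np.uint8)      # one intensity value per pixel
rgb = np.zeros((4, 6, 3), dtype=np.uint8)    # three channels per pixel: R, G, B
rgb[..., 0] = 255                            # set the red channel: every pixel is pure red

height, width = gray.shape                   # resolution = rows x columns of pixels
channels = rgb.shape[2]

print("Grayscale shape:", gray.shape)
print("RGB shape:", rgb.shape)
print("First RGB pixel:", rgb[0, 0])
```

A higher-resolution camera would simply produce arrays with more rows and columns.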

2. Preprocessing
• Raw images may contain noise, distortions, or uneven brightness.
Preprocessing prepares them for analysis.
• Common techniques:
  o Noise Reduction: Removes unwanted specks or grain (random noise).
  o Normalization: Adjusts pixel values to a fixed range.
  o Resizing/Cropping: Makes all images the same size.
  o Histogram Equalization: Improves contrast by spreading out pixel
  intensities.
• Example: Cleaning up a noisy, poorly lit passport photo for
verification.
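
Two of the techniques above, normalization and histogram equalization, can be sketched with NumPy alone. This is a minimal illustration, not a production routine: the function names and the tiny 2x3 image are assumptions made for the example, and libraries such as OpenCV offer their own implementations.

```python
import numpy as np

def normalize(img):
    """Min-max normalization: rescale pixel values to the range [0, 1]."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:                       # flat image: avoid division by zero
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)

def equalize_histogram(img):
    """Histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)   # count of each gray level
    cdf = hist.cumsum()                              # cumulative counts
    cdf_min = cdf[cdf > 0][0]                        # first non-zero cumulative count
    total = img.size
    if total == cdf_min:                             # only one gray level present
        return img.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)      # lookup table: old level -> new level
    return lut[img]

# Hypothetical low-contrast image: all values squeezed between 100 and 130.
img = np.array([[100, 110, 120],
                [110, 120, 130]], dtype=np.uint8)

norm = normalize(img)
eq = equalize_histogram(img)
print(norm)
print(eq)
```

After equalization the gray levels span the full 0 to 255 range, which is why the technique improves contrast.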

3. Feature Extraction
• This step identifies important patterns or attributes from the
image.
• Techniques include:
  o Edge Detection: Finds boundaries of objects.
  o Corner Detection: Finds points where edges meet.
  o Texture Analysis: Recognizes smoothness, roughness, or
  repeated patterns.
  o Colour-based Features: Helps in distinguishing objects by
  colour.
• Example: A smartphone camera detecting a person’s face by locating
eyes, nose, and mouth edges.
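
Edge detection can be illustrated with the Sobel operator, one common choice among several. The sketch below is a plain NumPy version written for clarity rather than speed; the function name `sobel_edges` and the synthetic test image are assumptions for this example.

```python
import numpy as np

def sobel_edges(img):
    """Gradient magnitude of a 2-D grayscale image using 3x3 Sobel kernels.

    The kernels are applied by sliding-window correlation over the valid
    region only, so the output is 2 pixels smaller in each dimension.
    """
    img = img.astype(np.float64)
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=np.float64)   # responds to horizontal change
    ky = kx.T                                       # responds to vertical change
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            patch = img[i:i + h - 2, j:j + w - 2]   # image shifted by (i, j)
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.hypot(gx, gy)                         # sqrt(gx**2 + gy**2)

# Synthetic 5x6 image: dark left half, bright right half.
img = np.zeros((5, 6), dtype=np.uint8)
img[:, 3:] = 255

edges = sobel_edges(img)
print(edges)
```

The output is zero in the flat regions and large only in the columns that straddle the dark-to-bright boundary, which is exactly where the object edge lies.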

4. Detection/Segmentation
• Detects objects or regions of interest in the image.
• Four common tasks:
  1. Classification: Identifies what the object is (e.g., “This is a
  dog”).
  2. Classification + Localization: Identifies the object and its
  position using a bounding box.
  3. Object Detection: Finds and labels multiple objects in one
  image.
  4. Image Segmentation: Separates pixels into regions for better
  understanding.
     • Semantic Segmentation: Groups similar pixels into one
     class (e.g., all trees).
     • Instance Segmentation: Differentiates between
     individual objects of the same class (e.g., three separate
     dogs).
• Example: In a self-driving car, segmentation helps identify lanes,
pedestrians, and traffic lights.
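
The ideas of segmentation and localization can be shown on a toy image. The sketch below uses simple intensity thresholding, which is far cruder than the deep-learning models used for semantic or instance segmentation, and then computes a bounding box around the segmented region. The function names, the threshold of 128, and the synthetic image are all assumptions for illustration.

```python
import numpy as np

def threshold_segment(img, thresh):
    """Label each pixel as foreground (True) if it is brighter than thresh."""
    return img > thresh

def bounding_box(mask):
    """Return (top, left, bottom, right) of the foreground, or None if empty."""
    rows = np.any(mask, axis=1)        # which rows contain foreground pixels
    cols = np.any(mask, axis=0)        # which columns contain foreground pixels
    if not rows.any():
        return None
    r = np.where(rows)[0]
    c = np.where(cols)[0]
    return int(r[0]), int(c[0]), int(r[-1]), int(c[-1])

# Synthetic 6x8 image with one bright 3x3 "object" on a dark background.
img = np.zeros((6, 8), dtype=np.uint8)
img[2:5, 3:6] = 200

mask = threshold_segment(img, 128)     # segmentation: pixel-level labels
box = bounding_box(mask)               # localization: box around the object
print("Foreground pixels:", int(mask.sum()))
print("Bounding box (top, left, bottom, right):", box)
```

The mask answers "which pixels belong to the object", while the box answers "where is the object", mirroring the difference between segmentation and localization.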

5. High-Level Processing
• Final stage: interpreting results to make decisions.
• The system recognizes objects, understands context, and analyses
scenes.
• Example: In healthcare, after detecting a tumour in a scan, the
system suggests whether it is likely cancerous.
Q2. What are the applications of Computer Vision in real life?
Explain with examples.
Answer:
Computer Vision (CV) is one of the most important fields of Artificial
Intelligence. It helps machines not only to see images but also to analyze,
understand, and take decisions based on visual data. Today, Computer
Vision is applied in almost every sector. Some major applications are:

1. Healthcare and Medical Imaging
• CV helps in detecting diseases early through medical scans like
X-rays, MRI, and CT scans.
• It can identify tumors, fractures, or infections that doctors may miss.
• Example: AI-based systems detect early signs of cancer from
mammogram images.

2. Self-Driving Cars
• Autonomous vehicles use computer vision to see and interpret the
road.
• CV detects pedestrians, other vehicles, road signs, and traffic lights.
• It helps the car in lane detection, obstacle avoidance, and safe
navigation.
• Example: Tesla’s Autopilot uses cameras and CV algorithms for
decision-making.

3. Surveillance and Security
• CV is widely used in CCTV cameras and monitoring systems.
• It can recognize faces, detect suspicious behavior, and even track
objects in real-time.
• Example: Airports use face recognition systems for passenger
verification.

4. Retail and E-commerce
• CV helps in automated checkout systems, stock management, and
customer behavior analysis.
• Online shopping platforms use CV for visual search: a customer can
upload a photo to find similar products.
• Example: Amazon Go stores use CV to allow customers to shop without
cashiers.

5. Agriculture
• Farmers use CV to monitor crop health, detect pests, and analyze
soil conditions.
• Drones with cameras capture images of fields, and CV algorithms
detect problems.
• Example: CV detects whether leaves of crops show disease symptoms.

6. Entertainment and Augmented Reality (AR)
• CV is used in video games, movies, and filters on social media
apps.
• AR applications overlay digital images on real-world objects using CV.
• Example: Snapchat and Instagram filters use facial landmark detection.

7. Industrial Automation
• In factories, CV helps in quality inspection and ensures products
meet standards.
• Robots use CV to identify, pick, and assemble objects on production
lines.
• Example: Detecting defective chips in an electronics factory.
