Part B Unit 5 - Computer Vision - Notes

The document outlines various aspects of computer vision, including its applications in smartphone technology, medical imaging, and workplace safety. It discusses key concepts such as image resolution, grayscale and RGB images, convolution in image processing, and the structure of Convolutional Neural Networks. Additionally, it addresses practical scenarios in agriculture and self-driving cars, emphasizing the importance of computer vision in enhancing efficiency and safety.

Class X – Artificial Intelligence

Part B: Unit – 5 Computer Vision


Homework questions:
1. Face lock feature of a smartphone is an example of computer vision. Briefly discuss this
feature. (CBSE 2023)
2. Give any two key impacts of Computer Vision on Medical Imaging (CBSE 2025)
3. Define the terms: Pixel and Pixel Value.
4. Enlist two smartphone apps that utilize computer vision technology. How have these apps
improved your efficiency or convenience in daily tasks?
5. “Imagine you're a researcher tasked with improving workplace safety in a manufacturing
environment. You decide to employ computer vision technology to enhance safety measures.”
How would you utilize computer vision in two distinct applications to promote safety within
the manufacturing plant, ensuring both the physical well-being of employees and the efficiency
of operations?
Provide detailed explanations for each application, including the specific computer vision
techniques or algorithms you would employ, and how they would contribute to achieving your
safety goals.

Answer the following questions (02 marks):


1. Differentiate between grayscale and RGB images. (CBSE 2024)
Ans: Grayscale and RGB are two primary digital image formats used for distinct purposes. A
grayscale image consists of pixels ranging from black to white, each carrying a single
grey-intensity value. An RGB image, in contrast, can exhibit a wide variety of hues because
it is made up of three colour channels: red, green, and blue.

2. Explain the term resolution with an example.


Ans: The number of pixels in an image is called its resolution. There are two conventions to
express the resolution of an image:
1) to express the number of pixels as the width by height of the image: for example, a
monitor has a resolution of 1280×1024. This means there are 1280 pixels from one side
to the other, and 1024 from top to bottom.
2) to express the number of pixels as a single number, like a 5 megapixel camera
(a megapixel is a million pixels). This means the pixels along the width multiplied by
the pixels along the height of an image taken by the camera equals 5 million pixels.
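The second convention above is just arithmetic on the first. A small sketch in Python, using a hypothetical sensor size chosen for illustration:

```python
# Resolution expressed two ways: width x height, or a single megapixel count.
# A hypothetical 2560 x 1944 sensor is roughly a "5 megapixel" camera.
width, height = 2560, 1944

total_pixels = width * height          # pixels along width x pixels along height
megapixels = total_pixels / 1_000_000  # a megapixel is a million pixels

print(f"{width}x{height} = {total_pixels} pixels, about {megapixels:.1f} MP")
```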

3. What are grayscale images?


Ans: Grayscale images are images which have a range of shades of grey without apparent
colour. The darkest possible shade is black, the total absence of colour, with a pixel value
of 0. The lightest possible shade is white, the full presence of colour, with a pixel value
of 255.
A grayscale image stores each pixel in 1 byte, in a single plane (a 2D array of pixels). The
size of a grayscale image is therefore the Height x Width of that image.

4. What is an RGB image? Why is it called so?


Ans: RGB images are made up of the three primary colours: Red, Green and Blue. All the
colours that are present can be made by combining different intensities of red, green and blue.
Every RGB image is stored in the form of three different channels called the R channel, the G
channel and the B channel. Each plane separately has a number of pixels, with each pixel
value varying from 0 to 255. The three planes combined together form a colour image.
This means that in an RGB image, each pixel has a set of three values which together
give colour to that particular pixel.
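The storage difference between the two formats can be sketched in Python, using a hypothetical 3 x 4 image for illustration:

```python
# Hypothetical 3 x 4 image: grayscale stores one value (0-255) per pixel,
# while RGB stores three values (R, G and B channels) per pixel.
height, width = 3, 4

grayscale_size = height * width      # 1 byte per pixel -> Height x Width bytes
rgb_size = height * width * 3        # 3 bytes per pixel, one per channel

# One RGB pixel is a set of three values; full red is (255, 0, 0).
red_pixel = (255, 0, 0)

print(grayscale_size, rgb_size, red_pixel)
```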
5. Explain the term Image Convolution.
Ans: Convolution is a simple mathematical operation which is fundamental to many common
image processing operators. Convolution provides a way of "multiplying together" two arrays
of numbers, generally of different sizes but of the same dimensionality, to produce a third
array of numbers of the same dimensionality.
An (image) convolution is simply an element-wise multiplication of an image patch with
another array, called the kernel, followed by a sum.
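The "element-wise multiply, then sum" step above can be sketched in plain Python. The 4 x 4 image and the 3 x 3 vertical-edge kernel below are hypothetical values chosen for illustration:

```python
# Minimal 2D convolution sketch: no padding, stride 1.
# The kernel slides over the image; at each position the kernel is
# multiplied element-wise with the image patch and the products are summed.
image = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

def convolve(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Element-wise multiplication of kernel and patch, followed by sum.
            total = sum(
                image[i + m][j + n] * kernel[m][n]
                for m in range(kh)
                for n in range(kw)
            )
            row.append(total)
        output.append(row)
    return output

feature_map = convolve(image, kernel)
print(feature_map)  # [[0, 3], [0, 3]]
```

The high responses (3) appear exactly where the image changes from 1s to 0s, which is why convolution with such a kernel acts as an edge detector.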

6. What is an image feature?


Ans: In computer vision and image processing, a feature is a piece of information which is
relevant for solving the computational task related to a certain application. Features may be
specific structures in the image such as points, edges or objects.
7. Briefly describe the purpose of the convolution operator in image processing.

Answer the following questions (04 marks):

1. What are the different layers in Convolutional Neural Network? What features are
likely to be detected by the initial layers of a neural network and how is it different from
what is detected by the later layers?
Ans: A Convolutional Neural Network (CNN) is a Deep Learning algorithm which can take
in an input image, assign importance (learnable weights and biases) to various aspects/objects
in the image and be able to differentiate one from the other.
A convolutional neural network consists of the following layers:
1) Convolution Layer: It is the first layer of a CNN. The objective of the convolution
operation is to extract features, such as edges, from the input image. It applies the
convolution operation on the image, and the output of this layer is called the feature map.
There are several uses we derive from the feature map:
• We reduce the image size so that it can be processed more efficiently.
• We focus only on those features of the image that can help us in processing it further.

2) Rectified linear Unit (ReLU): The next layer in the Convolution Neural Network is the
Rectified Linear Unit function or the ReLU layer. After we get the feature map, it is then
passed onto the ReLU layer. This layer simply gets rid of all the negative numbers in the
feature map and lets the positive number stay as it is.
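The ReLU step can be sketched in one line of Python, applied to a hypothetical feature map chosen for illustration:

```python
# ReLU: every negative number in the feature map becomes 0,
# every positive number stays as it is.
feature_map = [
    [-3, 2, -1],
    [ 4, -5, 6],
]

relu_map = [[max(0, value) for value in row] for row in feature_map]

print(relu_map)  # [[0, 2, 0], [4, 0, 6]]
```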

3) Pooling Layer: Similar to the Convolutional Layer, the Pooling layer is responsible for
reducing the spatial size of the Convolved Feature while still retaining the important features.
The pooling layer is an important layer in the CNN as it performs a series of tasks which are
as follows:
a) Makes the image smaller and more manageable
b) Makes the image more resistant to small transformations, distortions and translations
in the input image.
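A common pooling choice is 2 x 2 max pooling with stride 2, which keeps the largest value in each block and halves each dimension. A sketch in Python, on a hypothetical 4 x 4 feature map:

```python
# 2x2 max pooling with stride 2: each non-overlapping 2x2 block of the
# feature map is replaced by its largest value, shrinking the map from
# 4x4 to 2x2 while retaining the strongest feature responses.
feature_map = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]

def max_pool_2x2(fm):
    pooled = []
    for i in range(0, len(fm), 2):
        row = []
        for j in range(0, len(fm[0]), 2):
            block = [fm[i][j], fm[i][j + 1], fm[i + 1][j], fm[i + 1][j + 1]]
            row.append(max(block))
        pooled.append(row)
    return pooled

print(max_pool_2x2(feature_map))  # [[6, 5], [7, 9]]
```

Because only the maximum of each block survives, small shifts or distortions of a feature within a block do not change the pooled output, which is what makes the image "more resistant to small transformations".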

4) Fully Connected Layer: The final layer in the CNN is the Fully Connected (FC) Layer.
The objective of a fully connected layer is to take the results of the convolution/pooling
process and use them to classify the image into a label (in a simple classification example).
The initial layers of a CNN detect simple, low-level features such as edges, corners and
patches of colour. The later layers combine these into progressively more complex,
high-level features such as shapes, textures and whole objects (for example, eyes and then
entire faces in a face-recognition network).

2. Explain the distinctions between image classification, classification with localization,
object detection, and instance segmentation in computer vision tasks. Provide examples
for each to support your answer.
Ans: The tasks used in a computer vision application are:

Classification:
The image classification problem is the task of assigning an input image one label from a
fixed set of categories.
Eg: A banking app that detects if a photo is of a check or not.
Classification + Localisation:
This is the task which involves both processes of identifying what object is present in the image
and at the same time identifying at what location that object is present in that image.
Eg: A home security camera that detects and highlights a package on the doorstep.

Object Detection:
Object detection is the process of finding instances of real-world objects such as faces, bicycles,
and buildings in images or videos. Object detection algorithms typically use extracted features
and learning algorithms to recognize instances of an object category.
Eg: A self-driving car detecting pedestrians, traffic lights, and other cars.

Instance Segmentation:
Instance Segmentation is the process of detecting instances of the objects, giving them a
category and then giving each pixel a label on the basis of that. A segmentation algorithm takes
an image as input and outputs a collection of regions (or segments).
Eg: A medical imaging system identifying and outlining individual tumors in an MRI scan.

Case Based Questions:


1. “Agriculture is an industry where precision and efficiency are crucial for
sustainable production. Traditional farming methods often rely on manual labor
and visual inspection, which can be time-consuming and error-prone. However,
advancements in computer vision technology offer promising solutions to optimize
various agricultural processes.
Agricultural drones equipped with high-resolution cameras and computer vision
algorithms are increasingly being used to monitor crop health, detect diseases, and assess
crop yields.”
Answer the following questions based on the case study mentioned above:
a) How does the integration of computer vision technology with drones improve
efficiency in agricultural practices compared to traditional methods?
Ans:
• Traditional farming relies on manual inspection, which is slow, tiring, and covers only
a small part of the field.
• Drones with computer vision can scan large farms quickly and give a full picture of
crop health.

• They allow frequent monitoring (even daily), which helps farmers spot problems
early.
• Data is objective and accurate, unlike human judgment which may vary.
• Farmers can use the results for precision farming – applying water, fertilizer, or
pesticide only where needed, saving time, cost, and reducing waste.

b) What are some key indicators or parameters that computer vision algorithms
can analyze to assess crop health and detect diseases?

Ans: Computer vision can check:


• Leaf color – yellowing can mean lack of nutrients.
• Leaf spots or patterns – can signal disease or pest attacks.
• Canopy cover (density of plants) – shows how well crops are growing.
• Plant temperature (with thermal cameras) – high temperature may mean water
stress.
• Growth stage – checking if crops are flowering or ready for harvest.
• Weed detection – spotting unwanted plants among crops.

2. You are tasked with developing a computer vision system for a self-driving car
company. The system needs to accurately detect and classify various objects on the
road to ensure safe navigation. Imagine you're working on improving the object
detection algorithm for the self-driving car's computer vision system. During
testing, you notice that the system occasionally misclassifies pedestrians as cyclists,
especially in low-light conditions.
How would you approach addressing this issue? What steps would you take to
enhance the accuracy of pedestrian detection while ensuring the system's overall
performance and reliability on the road?
Ans: If the system is confusing pedestrians with cyclists at night, I would improve it step by
step:
1. Better Training Data:
Collect more images of pedestrians and cyclists in low-light conditions (night, fog,
rain, streetlights). This helps the system learn the difference even when visibility is
poor.
2. Image Enhancement:
Use techniques to make dark images clearer, like brightness adjustment or noise
reduction, so the computer can “see” better.
3. Improved Algorithms:
Add a pose detection algorithm (to see human body shapes) so the system can tell if
a person is walking or riding a bike.
4. Use Multiple Sensors:
If possible, combine camera vision with other sensors like LiDAR or radar. These can
detect shapes and movement even in the dark.
5. Safer Decisions:
If the system is not sure whether it’s a pedestrian or cyclist, it should always assume
“pedestrian” because that’s safer.
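The "safer decision" rule in step 5 can be sketched as a simple tie-break. The function name, the confidence inputs and the margin value below are hypothetical, chosen only to illustrate the idea; a real detector would output class confidences in a similar 0-to-1 range:

```python
# Sketch of the "assume pedestrian when unsure" rule, assuming the
# detector returns hypothetical confidence scores between 0 and 1.
def classify_road_user(pedestrian_conf, cyclist_conf, margin=0.2):
    # If the two confidences are too close to call, pick the safer
    # interpretation and treat the object as a pedestrian.
    if abs(pedestrian_conf - cyclist_conf) < margin:
        return "pedestrian"
    return "pedestrian" if pedestrian_conf > cyclist_conf else "cyclist"

print(classify_road_user(0.45, 0.50))  # too close to call -> pedestrian
print(classify_road_user(0.10, 0.85))  # clearly a cyclist -> cyclist
```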
