Computer Vision Questions & Answers
Question 1: What is the output colour when you put R = G = B = 255?
Ans. In the RGB color model, when R = 255, G = 255, and B = 255, the output color is white.
Question 2: What is the output colour when you put R = G = B = 0?
Ans. When you set R (Red) = 0, G (Green) = 0, and B (Blue) = 0, the output color is black. In the RGB
color model, (0, 0, 0) represents the absence of light, resulting in black.
Question 3.
How does the colour vary when you put either of the three as 0 and then keep on varying the other
two?
Answer:
Red = 0, vary Green and Blue:
Colors will range from black through shades of green and blue to cyan (turquoise-like colors).
Example: (0, 255, 255) is cyan, (0, 0, 0) is black, (0, 128, 128) is a darker cyan.
Green = 0, vary Red and Blue:
Colors will range from black through shades of red and blue to magenta and purple.
Example: (255, 0, 255) is magenta, (0, 0, 0) is black, (128, 0, 128) is a darker purple.
Blue = 0, vary Red and Green:
Colors will range from black through shades of red and green to yellow.
Example: (255, 255, 0) is yellow, (0, 0, 0) is black, (128, 128, 0) is olive.
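The three cases above can be checked with a short Python sketch. The lookup table and the helper name to_hex are illustrative assumptions, not part of any library; the table lists only a few hand-picked colors.

```python
# Illustrative sketch: a few RGB triples in which at least one channel is 0.
# NAMED_COLORS is a hand-picked (assumed) table, not an exhaustive color list.
NAMED_COLORS = {
    (0, 0, 0): "black",
    (0, 255, 0): "green",      # red = 0, only green raised
    (0, 0, 255): "blue",       # red = 0, only blue raised
    (0, 255, 255): "cyan",     # red = 0, green and blue at maximum
    (255, 0, 255): "magenta",  # green = 0, red and blue at maximum
    (255, 255, 0): "yellow",   # blue = 0, red and green at maximum
    (128, 128, 0): "olive",    # blue = 0, red and green at half strength
}

def to_hex(r, g, b):
    """Format an 8-bit RGB triple as a web-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(r, g, b)

for rgb, name in NAMED_COLORS.items():
    print(name, rgb, to_hex(*rgb))
```

Pasting the printed hex strings into any color picker shows the corresponding color.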
Question 4.
How does the output colour change when all the three colours are varied in same proportion?
Answer: When the RGB (Red, Green, Blue) values are varied equally, the resulting color moves along
a gradient from black to white through shades of gray. This means that increasing all three values
equally will make the color lighter, while decreasing them will make the color darker, without
introducing any hue. For example:
(0, 0, 0) is black.
(128, 128, 128) is a medium gray.
(255, 255, 255) is white.
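As a quick check, the sketch below converts equal RGB values to the HSV model with Python's standard colorsys module. The helper name saturation_and_value is an assumption made for this example. A saturation of 0 means the color is a pure gray, and the value component shows how light it is.

```python
import colorsys

def saturation_and_value(r, g, b):
    """Return (saturation, value) of an 8-bit RGB triple, both in the range 0..1."""
    _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return s, v

# Equal R, G and B: saturation stays at 0 while the brightness rises.
for level in (0, 128, 255):
    s, v = saturation_and_value(level, level, level)
    print(level, "saturation:", s, "brightness:", round(v, 2))
```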
Question 5.
What is the RGB value of your favourite colour from the colour palette?
Answer:
The RGB value of light blue is (173,216,230).
Question 6.
What is a Kernel?
Answer:
A kernel is a small matrix that is slid across the image. At each position its values are multiplied
with the pixels beneath it and the products are added up, so that the output is enhanced in a certain
desirable manner. Different kernels hold different values, depending on the kind of effect that we
want to apply to an image.
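The sliding, multiplying and adding described above can be sketched in plain Python. The function name apply_kernel, the tiny image and the edge kernel are assumptions chosen for illustration. The sketch uses no padding and a step of one pixel, and it does not flip the kernel, which is how most CNN libraries apply kernels.

```python
def apply_kernel(image, kernel):
    """Slide `kernel` over `image` (lists of lists), no padding, step of 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Multiply each kernel value with the pixel under it and add up.
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# A 3 x 6 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
]
# A simple kernel that responds to vertical edges.
edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(apply_kernel(image, edge_kernel))  # [[0, 765, 765, 0]]
```

The output is 0 over the flat regions and large where the window straddles the dark-to-bright boundary, which is the "enhancement" this particular kernel provides.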
Question 7. What is a Neural Network?
Ans. A Neural Network is a computational model inspired by the way biological neural networks in the
brain process information. It consists of interconnected nodes (neurons) arranged in layers.
The three main layers are:
1. Input Layer: Receives the initial data or features to be processed.
2. Hidden Layer: Performs computations and feature extraction; there can be multiple hidden
layers.
3. Output Layer: Produces the final prediction or classification result.
Each neuron in a layer is connected to neurons in the next layer, with weights that are adjusted
during training to minimize error and improve accuracy.
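A minimal sketch of data flowing through the three layers is given below. The weights, biases, the sigmoid activation and the helper name dense are all hand-picked assumptions for illustration. Only the forward pass is shown; the training step that adjusts the weights is not.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    """One layer: each neuron computes sigmoid(weighted sum of inputs + bias)."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Hypothetical values, chosen by hand rather than learned.
inputs = [0.5, 0.8]                    # input layer: 2 features
hidden_w = [[0.4, -0.6], [0.3, 0.9]]   # hidden layer: 2 neurons, 2 weights each
hidden_b = [0.0, 0.1]
output_w = [[1.2, -0.7]]               # output layer: 1 neuron
output_b = [0.05]

hidden = dense(inputs, hidden_w, hidden_b)
output = dense(hidden, output_w, output_b)
print("hidden layer:", hidden)
print("output layer:", output)
```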
Question 1. It is basically the dimensions through which you can measure how many pixels are there
on a screen. What is it?
Answer: Resolution
Question 2. What is the process of detecting an instance of the object, categorising it and giving each
pixel a label on that basis?
Answer: Instance segmentation
Question 3. With reference to the AI domain, expand the term CV.
Answer: CV stands for Computer Vision.
Question 4. Which app uses face filters?
Answer: Snapchat uses face filters.
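Question 5. Give an example of object recognition.
Answer: Object recognition is identifying specific instances of objects and associating them with
known categories or concepts. For example, a photo app that identifies a cat or a car in a picture is
performing object recognition.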
Question 6. Mention any 2 features of images based on which they are characterised.
Answer: Two features based on which images are characterised are textures and edges.
Question 7. How many channels does a color image have?
Answer: Three channels (red, green and blue).
Question 8. What is a kernel?
Ans. A kernel is a small matrix of numerical values that defines a specific operation to be performed
on the input image.
Question 9. What is OpenCV?
Answer: OpenCV is an open-source computer vision and machine learning software library that
provides various tools and functions for image and video processing.
Question 1.
Name the four important layers in CNN.
Ans. The 4 layers in CNN are:
1. Convolution layer
2. Rectified Linear Unit (ReLU) layer
3. Pooling layer
4. Fully connected layer
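Of the four layers, the Rectified Linear Unit is the simplest to show in code: it replaces every negative value in a feature map with 0 and leaves the rest unchanged. The function name relu and the sample feature map below are assumptions for illustration.

```python
def relu(feature_map):
    """Replace negative values with 0; keep zero and positive values as they are."""
    return [[max(0, value) for value in row] for row in feature_map]

# A hypothetical 2 x 2 feature map, as might come out of a convolution layer.
feature_map = [
    [-3, 5],
    [2, -8],
]
print(relu(feature_map))  # [[0, 5], [2, 0]]
```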
Question 2.
Differentiate between Max pooling and Average pooling.
Answer:
The difference between max pooling and average pooling is:
Max pooling is the most commonly used method. It selects the maximum value of the current image
view and so helps preserve the strongest detected features.
Average pooling, by contrast, finds the average value of the current image view, which smooths the
features. Both methods downsample the feature map.
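The difference is easy to see on a small example. The sketch below assumes non-overlapping 2 x 2 windows; the helper names pool and average and the sample values are illustrative, not taken from any library.

```python
def pool(feature_map, size, reducer):
    """Apply `reducer` to each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(reducer(window))
        pooled.append(row)
    return pooled

def average(values):
    return sum(values) / len(values)

feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]
max_pooled = pool(feature_map, 2, max)
avg_pooled = pool(feature_map, 2, average)
print(max_pooled)  # [[7, 8], [9, 6]]
print(avg_pooled)  # [[4.0, 5.0], [4.5, 3.0]]
```

Both results shrink the 4 x 4 map to 2 x 2. Max pooling keeps the strongest value in each window, while average pooling blends all four values.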
Question 3.
Name the real-life applications of computer vision.
Ans. The real-life applications of computer vision are:
Facial recognition in smartphones
Face filters
Google Translate app
Question 4.
How does the computer see an image?
Ans. Every coloured digital image that is stored can be split into 3 different channels, i.e.
Red (R), Green (G) and Blue (B), with different intensities. The computer reads the numerical value
of each channel at every pixel and, from these values, determines the size and colour of the image.
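To make this concrete, the sketch below stores a tiny color image as nested Python lists of (R, G, B) triples and splits it into its three channels. The image values and the helper name split_channels are assumptions for illustration. Real programs usually read images with a library, but the numbers they work on have this shape.

```python
# A tiny 3 x 2 color image: each pixel is an (R, G, B) triple with values 0..255.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]

def split_channels(img):
    """Return three 2D lists holding the red, green and blue intensities."""
    return tuple([[pixel[c] for pixel in row] for row in img] for c in range(3))

height, width = len(image), len(image[0])
red, green, blue = split_channels(image)

print("size:", width, "x", height)  # size: 3 x 2
print("red channel:  ", red)
print("green channel:", green)
print("blue channel: ", blue)
```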
Question 5.
How is computer vision helping in medical imaging?
Answer:
Computer vision plays a pivotal role in medical imaging, facilitating tasks such as image analysis,
diagnosis, treatment planning, and surgical guidance across various medical specialties. By
analysing medical images, including X-rays, CT scans, MRIs, and ultrasound images, computer
vision algorithms assist healthcare professionals in detecting abnormalities, quantifying disease
progression, identifying anatomical structures, and guiding interventions.
Question 6. How does Google Translate use computer vision?
Ans. Google Translate employs computer vision to enable instant translation of text captured
through a smartphone camera. Using optical character recognition (OCR) technology, the app
recognises and extracts text from images, such as signs, menus, or documents. Then, machine
translation algorithms process the extracted text to provide translations in real-time, facilitating
communication across languages.
Question 7. Explain the term resolution with an example.
Ans. The resolution of an image refers to the number of pixels in the image across its width and
height. For example, a monitor resolution of 1280 × 1024 means there are 1280 pixels from one side
to the other, and 1024 pixels from top to bottom.
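The total number of pixels follows directly from the two dimensions. A small sketch of the arithmetic, with pixel_count as an illustrative helper name:

```python
def pixel_count(width, height):
    """Total pixels on a screen or in an image of the given resolution."""
    return width * height

total = pixel_count(1280, 1024)
print(total)                                      # 1310720
print(round(total / 1_000_000, 2), "megapixels")  # 1.31 megapixels
```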
Question 8. Face lock feature of a smartphone is an example of computer vision. Briefly discuss this
feature.
Ans. The face lock feature of a smartphone is indeed an example of computer vision technology in
action. It utilises the device’s front-facing camera and sophisticated algorithms to recognise and
authenticate the user’s face before granting access to the device.
Question 9. Can you define “digital image”?
Ans. A digital image is a picture that’s made up of smaller parts, called pixels. These pixels are made
of numerical components that represent their color codes and intensity. AI systems use these
numbers to understand an image.
Question 10. Differentiate between grayscale and RGB images.
Ans.
Grayscale image:
Grayscale images contain shades of gray, ranging from black to white, with no color information.
Each pixel in a grayscale image is represented by a single intensity value, typically an 8-bit value
ranging from 0 (black) to 255 (white), where intermediate values represent varying shades of gray.
RGB image:
RGB (Red, Green, Blue) images are composed of three color channels: red, green, and blue.
Each pixel in an RGB image is represented by a combination of three intensity values, one for each
color channel. Typically, each intensity value is an 8-bit integer ranging from 0 to 255, resulting in
24 bits per pixel (8 bits per channel).
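The bits-per-pixel figures in the comparison above translate directly into storage. The sketch below estimates the uncompressed size of a 1280 × 1024 image in both formats. The helper name raw_size_bytes is an assumption, and real image files are usually smaller because they are compressed.

```python
def raw_size_bytes(width, height, channels, bits_per_channel=8):
    """Uncompressed size of an image in bytes (no file header, no compression)."""
    return width * height * channels * bits_per_channel // 8

gray_bytes = raw_size_bytes(1280, 1024, channels=1)  # 8 bits per pixel
rgb_bytes = raw_size_bytes(1280, 1024, channels=3)   # 24 bits per pixel

print(gray_bytes)               # 1310720
print(rgb_bytes)                # 3932160
print(rgb_bytes // gray_bytes)  # 3
```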
Question 11. What’s the purpose of grayscaling?
Ans. Grayscale is the range of whiteness to blackness of a digital image. Programmers take an image
that’s in color and change it to grayscale, which is called grayscaling. This helps simplify the image
data so a computer can more easily process the input.
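One common way to grayscale a pixel is a weighted sum of its three channels. The sketch below assumes the widely used ITU-R BT.601 weights (0.299, 0.587, 0.114); other weightings, or a plain average, are also used in practice. The function name to_gray is illustrative.

```python
def to_gray(r, g, b):
    """Collapse an RGB pixel to one intensity using BT.601 weights (an assumption)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

# Three color values per pixel become a single value per pixel.
color_row = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
gray_row = [to_gray(*pixel) for pixel in color_row]
print(gray_row)  # [76, 150, 29, 255]
```

Green contributes most to the result because the human eye is most sensitive to green light.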
Question 12. What programming languages does computer vision support?
Ans. Computer vision can use programming languages such as Java, C/C++, Prolog, Python and
LISP. I’ve primarily used Java in past projects, but I have certification in Prolog as well and a basic
understanding of the others.
Computer Vision Class 10 Long Answer Type Questions
Question 1. Explain the different layers of convolution neural network.
Ans. Refer to text on page 136-137 (Layers of CNN)
Question 2. Define the computer vision task.
Ans. Refer to text on page 134 (Computer vision tasks)
Question 3. What is an image? Also, explain the basics of an image.
Ans. An image is a visual representation of an object, a scene or a concept.
Basics of image: Refer to text on page 134-135 (Basics of images)
Computer Vision Class 10 Activities
Imagine that your security camera is capturing an image. At the top of the image, we are given six
small patches of image. Take a pencil and mark the exact location of those patches in the image.
(a) Were you able to find the exact location of all the patches?
(b) Which one was the most difficult to find?
(c) Which one was the easiest to find?
Conclusion: The unique features of an image are the easiest to find; these may be the corners, the
start of the image or the end points of the image.
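The same conclusion can be reproduced with a toy search. The sketch below counts how many places in a small image a patch fits exactly. The image, the three patches and the function name count_matches are assumptions for illustration; real feature detectors are more elaborate.

```python
def count_matches(image, patch):
    """Count the positions where `patch` matches `image` exactly."""
    ph, pw = len(patch), len(patch[0])
    matches = 0
    for i in range(len(image) - ph + 1):
        for j in range(len(image[0]) - pw + 1):
            if all(image[i + di][j:j + pw] == patch[di] for di in range(ph)):
                matches += 1
    return matches

# 0 = dark background, 1 = a bright square object.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
flat_patch = [[0, 0], [0, 0]]    # plain background
edge_patch = [[0, 0], [1, 1]]    # top edge of the square
corner_patch = [[0, 0], [0, 1]]  # top-left corner of the square

print(count_matches(image, flat_patch))    # 9
print(count_matches(image, edge_patch))    # 2
print(count_matches(image, corner_patch))  # 1
```

The flat patch fits in many places and the edge patch in more than one, but the corner patch fits in exactly one place, which is why corners are the easiest to locate.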