Computer vision technology is a crucial part of AI that helps build machines with the ability to look at an image or video, understand it, and respond to it.
Challenges in Computer Vision:
An image is stored in digital form as an array of pixel values. Deep learning
techniques are required to extract insights from this data.
A very large data set is required to train a system to identify
objects from various angles and under various environmental
conditions.
Time-based decision making. Example: a surveillance robot must
generate an alert when someone crosses a railway line while a
train is approaching; otherwise, the crossing should be
considered normal.
In the case of living objects, the ability to differentiate between
the live object, a statue of the object, and a life-size poster or
photo of the object.
Understanding an object in its context.
What is Computer Vision?
Computer vision tasks include methods for acquiring, processing,
analyzing, and understanding digital images, and for extracting
high-dimensional data from the real world in order to produce
numerical or symbolic information, e.g. in the form of decisions.
PURPOSE OF COMPUTER VISION
Object Classification
Object Identification
Object Verification
Object Detection
Object Landmark Detection
Object Segmentation
Object Recognition
Visualization
The purpose is to observe objects that are not visible in an image
Image Sharpening and Restoration
The purpose is to create a better image
Image Retrieval
The purpose is to search for images of interest
Measurement of Pattern
The purpose is to measure various objects in an image
Image Recognition
The purpose is to distinguish the objects in an image
Changing color spaces:
Color spaces define how the values of pixels are represented
for different mediums. Some of the widely used color spaces
include RGB, HSV, CMYK, LAB, YCrCb, etc.
Example: by default, digital images are in the RGB (Red, Green,
Blue) color space. In RGB, the pixels carry only color-component
information and have no direct details about brightness or
saturation. This poses complexity if we wish to process the
brightness of the image. Similarly, some kinds of processing can
give poor results in certain color spaces. Hence, we might choose
to convert an image to a different color space based on the
requirement.
CMYK - Cyan, Magenta, Yellow and blacK - the color space which
is widely used in printers.
HSV - Hue, Saturation, Value - the preferred color space in high-
quality graphics, because it has separate components for color
(hue), color purity (saturation), and brightness (value). This makes
it easy to preserve the color and alter only the brightness of the
images.
Geometric Transformations:
As the name indicates, geometric transformation refers to altering
the geometric aspects of an image, such as its size or orientation,
without affecting the actual contents of the image. Some of the
geometric transformation techniques are listed below:
Scaling:
Generally, images are scaled down to minimize computational
time and resources, while taking into consideration that the
feature details are not lost.
Rotation/Reflection/Translation:
Images can be rotated by different angles so that they can be
analysed from different perspectives.
Mirroring can also be done to obtain insights about the image.
Image translation refers to moving the pixels of an image from one
position to another, with the aim of changing an object's position.
IMAGE SEGMENTATION
In computer vision, segmentation is the process of extracting
related pixels from an image. Segmentation algorithms usually
take an image and produce either a group of contours (the
boundaries of objects that have well-defined edges in the image)
or a mask in which each set of related pixels is assigned a unique
color value to identify it.
The main purpose of image segmentation is to partition an
image into collections of pixels that correspond to:
– Meaningful regions (coherent objects)
– Linear structures (line, curve, …)
– Shapes (circles, ellipses, …)
A very simple example of image segmentation is the method
based on thresholding.
Binary Segmentation
Suppose you have an image and you want to segment it into two
parts (light and dark). You can use a threshold value, for example
100. After segmenting, the image is split into two parts: pixels
with intensity higher than 100 (which can be set to 255, the
maximum intensity value, white) and pixels with intensity of 100
or less (set to 0, the minimum intensity value, black). This is called
binary segmentation. If you use more thresholds, you get more
segments.
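The binary segmentation just described can be sketched in a few lines of NumPy. The gradient image is a stand-in for real input; OpenCV's cv2.threshold performs the same operation.

```python
import numpy as np

# Synthetic grayscale image: each row is a gradient from
# dark (0, left) to light (199, right)
img = np.tile(np.arange(200, dtype=np.uint8), (100, 1))

# Binary segmentation with threshold 100:
# pixels above 100 become 255 (white), the rest become 0 (black)
threshold = 100
binary = np.where(img > threshold, 255, 0).astype(np.uint8)
```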
Feature
A feature is a piece of information which is relevant for solving the
computational task related to a certain application. Features may
be specific structures in the image such as points, edges or
objects. Features may also be the result of a general
neighborhood operation or feature detection applied to the
image.
Main Component Of Feature Detection And Matching
Detection:
Identify the Interest Point in the image.
Some features lie at specific locations in the image, such as
mountain peaks, building corners, doorways, or interestingly
shaped patches of snow. These kinds of localized features are
often called keypoint features (or even corners) and are often
described by the appearance of the patch of pixels surrounding
the point location.
Features that can be matched based on their orientation and
local appearance (edge profiles) are called edges; they can
also be good indicators of object boundaries and occlusion events
in an image sequence.
Description:
The local appearance around each feature point is described in
some way that is (ideally) invariant under changes in illumination,
translation, scale, and in-plane rotation. We typically end up with
a descriptor vector for each feature point.
Matching:
Descriptors are compared across the images to identify similar
features. For two images we may get a set of pairs (Xi, Yi) ↔ (Xi′,
Yi′), where (Xi, Yi) is a feature in one image and (Xi′, Yi′) is its
matching feature in the other image.
IMAGE RECOGNITION
Recognition is one of the toughest challenges in computer vision.
For the human eye, recognizing an object's features or attributes
is very easy; humans can recognize multiple objects with very
little effort. However, this does not apply to a machine. It is very
hard for a machine to recognize or detect an object because
objects vary in terms of viewpoint, size, and scale.
Object Recognition
Object recognition is used to identify an object in an image or
video. It is a product of machine learning and deep learning
algorithms. Object recognition tries to reproduce this innate
human ability to understand certain features or visual details of
an image.
The output of object recognition will include the identified object
category along with the probability of correctness.
Object recognition refers to identification of what is present in the
image, while object detection refers to locating where it is present
in the image.
Object recognition through deep learning can be achieved by
training models from scratch or by utilizing pre-trained deep
learning models. To train a model from scratch, the first thing
you need to do is collect a large labeled dataset. Then you need
to design an architecture that will be used to create the model.
Besides deep learning, object recognition is also possible through
classical algorithmic approaches.
The following are commonly used approaches:
HOG feature extraction
Bag of words model
Viola-Jones algorithm
IMAGE DETECTION
Image or Object Detection is a technique that processes the
image and detects objects in it.
When it comes to applying deep learning to image detection,
developers use Python along with open-source libraries like
OpenCV, Open Detection, Luminoth, ImageAI, and others. These
libraries simplify the learning process and offer a ready-to-use
environment.
The commonly used techniques for Object Detection are
• Haar cascade algorithm
• Viola-Jones algorithm
Object detection uses an object's features to classify its class.
For example, when looking for circles in an image, the machine
will detect any round object. To recognize instances of an object
class, the algorithm uses learning techniques and features
extracted from the image.