Computer Vision Notes
A) Edge Detection:
➢ Edge detection provides an intrinsically more rigorous means
than thresholding for initiating image segmentation.
➢ Edge detection has long been an alternative path to image
segmentation.
➢ Edge detection has the additional advantage that it
immediately reduces, by a large factor, the considerable
redundancy of most image data: this is useful because it
significantly reduces both the space needed to store the
information and the amount of processing subsequently
required to analyze it.
➢ Edge detection has gone through an evolution spanning well
over 30 years.
➢ Two main methods of edge detection have been apparent over
this period.
➢ The first of these being the template matching (TM) approach
and the second being the differential gradient (DG) approach.
➢ In either case the aim is to find where the intensity gradient
magnitude g is sufficiently large to be taken as a reliable
indicator of the edge of an object.
➢ Then g can be thresholded in a similar way to that in which
intensity was thresholded earlier.
➢ The TM and DG methods differ mainly in how they proceed
to estimate g locally; however, there are also important
differences in how they determine local edge orientation,
which is an important variable in certain object detection
schemes
➢ There are a variety of types of edge, including in particular the
“sudden step” edge, the “slanted step” edge, the “planar”
edge, and various intermediate edge profiles.
➢ [Figure: edge profiles, (a) sudden step edge; (b) slanted step
edge; (c) smooth step edge; (d) planar edge; (e) roof edge; and
(f) line edge]
Operators:
Template Matching Operators:
➢ In the TM case, it is usual to employ up to 12 convolution
masks capable of estimating local components of gradient in
different directions
➢ In the TM approach, the local edge gradient magnitude is
approximated by taking the maximum of the responses for the
component masks:
➢ g = max(g1, g2, …, gn)
➢ where n = 8 or 12, and gi is the response of the ith mask
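As a rough illustration of the TM scheme, the sketch below generates eight compass masks by rotating a Prewitt-style base mask (an assumption: the notes do not specify which mask set is used) and takes the maximum response as g:

```python
import numpy as np

def rotate45(m):
    """Rotate the outer ring of a 3x3 mask one step (45 degrees)."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    vals = [m[r, c] for r, c in ring]
    vals = vals[-1:] + vals[:-1]          # cycle the ring one position
    out = m.copy()
    for (r, c), v in zip(ring, vals):
        out[r, c] = v
    return out

# Prewitt-style compass base mask (an illustrative choice)
base = np.array([[ 1,  1,  1],
                 [ 1, -2,  1],
                 [-1, -1, -1]], dtype=float)
masks = [base]
for _ in range(7):
    masks.append(rotate45(masks[-1]))

def tm_edge(patch):
    """g = max over the n mask responses; orientation = winning mask."""
    responses = [float(np.sum(m * patch)) for m in masks]
    k = int(np.argmax(responses))
    return responses[k], k * 45   # magnitude, orientation in degrees

patch = np.array([[1, 1, 1],
                  [5, 5, 5],
                  [9, 9, 9]], dtype=float)  # intensity rising downward
g, theta = tm_edge(patch)
```

Here the winning mask is the one whose positive weights face the bright side of the edge, so the orientation comes out for free as the winner's index.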
➢ In the TM approach, edge orientation is estimated simply as
that of the mask giving rise to the largest value of gradient.
DIFFERENTIAL GRADIENT OPERATORS:
➢ In the case of the DG type of operator, only two such masks
are required—for the x and y directions.
➢ In the DG approach, the local edge magnitude may be
computed vectorially using the nonlinear transformation:
➢ g = (gx² + gy²)^(1/2)
➢ To save computational effort, it is common practice to
approximate this formula by one of the simpler forms:
➢ g ≈ |gx| + |gy| or g ≈ max(|gx|, |gy|)
➢ In the DG approach, edge orientation is estimated vectorially
by the more complex equation:
➢ θ = arctan(gy/gx)
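The DG formulas above can be sketched with the Sobel x/y masks (an assumption; the notes do not name a specific DG operator):

```python
import numpy as np

# Sobel x-derivative mask; its transpose gives the y-derivative mask
sx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
sy = sx.T

def dg_edge(patch):
    gx = float(np.sum(sx * patch))
    gy = float(np.sum(sy * patch))
    g_exact = float(np.hypot(gx, gy))        # g = (gx^2 + gy^2)^(1/2)
    g_approx = abs(gx) + abs(gy)             # cheaper approximation
    theta = float(np.degrees(np.arctan2(gy, gx)))  # edge orientation
    return g_exact, g_approx, theta

patch = np.array([[0, 0, 9],
                  [0, 0, 9],
                  [0, 0, 9]], dtype=float)   # vertical step edge
g, ga, theta = dg_edge(patch)
```

For an axis-aligned step like this the exact and approximate magnitudes agree; they differ most for diagonal edges, which is the price of the cheaper forms.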
• In global thresholding, the image intensity histogram is
searched for a significant minimum between its two main peaks.
• If a significant minimum is found, it is interpreted as the
required threshold value.
• The assumption being made here is that the peak on the left of
the histogram corresponds to dark objects, and the peak on the
right corresponds to light background.
ADAPTIVE THRESHOLDING:
• The problem that arises when illumination is not sufficiently
uniform may be tackled by permitting the threshold to vary
adaptively (or “dynamically”) over the whole image.
• There are several ways of achieving this:
• One involves modeling the background within the image.
• Another is to work out a local threshold value for each pixel
by examining the range of intensities in its neighborhood.
• A third approach is to split the image into sub images and deal
with them independently.
The Chow and Kaneko Approach:
• In 1972, Chow and Kaneko introduced what is widely recognized
as the standard technique for dynamic thresholding.
• The technique performs a thoroughgoing analysis of the
background intensity variation, making few compromises to
save computation.
• In this method, the image is divided into a regular array of
overlapping sub images and individual intensity histograms
are constructed for each one.
• Those that are unimodal are ignored since they are assumed
not to provide any useful information that can help in
modeling the background intensity variation.
• The bimodal distributions are well suited to this task: these are
individually fitted to pairs of Gaussian distributions of
adjustable height and width and the threshold values are
located.
Local Thresholding Methods:
• This approach involves analyzing the intensities in the
neighborhood of each pixel to determine the optimum local
thresholding level.
• In principle, the Chow and Kaneko histogramming technique
could be repeated at each pixel, but this would significantly
increase the computational load of an already computationally
intensive technique.
• One simple means for achieving this is to take a suitably
computed function of nearby intensity values as the threshold:
often the mean of the local intensity distribution is taken
because this is a simple statistic and gives good results in some
cases.
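The local-mean rule above can be sketched as follows; the neighborhood radius r and the optional offset C below the mean are illustrative choices, not values from the notes:

```python
import numpy as np

def local_mean_threshold(img, r=1, C=0.0):
    """Threshold each pixel against the mean of its (2r+1)x(2r+1)
    neighborhood, so the threshold adapts to local illumination."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    padded = np.pad(img, r, mode='edge')
    out = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            win = padded[y:y + 2 * r + 1, x:x + 2 * r + 1]
            out[y, x] = img[y, x] > win.mean() - C
    return out

# Dark object pixels on a background whose brightness ramps up:
img = np.array([[10, 12, 14, 16],
                [10,  2, 14, 16],
                [10, 12,  4, 16],
                [10, 12, 14, 16]])
mask = local_mean_threshold(img)   # False where the dark object is
```

Because each pixel is judged against its own neighborhood, the ramp in the background does not need a single global threshold to work.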
Variance-Based Thresholding:
• The standard approach to thresholding outlined earlier
involved finding the neck of the global image intensity
histogram.
• In these methods, the image intensity histogram is analyzed to
find where it can best be partitioned to optimize criteria based
on ratios of the within-class, between-class, and total variance.
• The simplest approach (Otsu, 1979) is to calculate the
between-class variance.
• First, we assume that the image has a grayscale resolution of
L gray levels.
• The number of pixels with gray level i is written as ni, so the
total number of pixels in the image is N .
• N = n1 + n2 + … + nL
• Thus, the probability of a pixel having gray level i is:
• pi = ni / N
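Otsu's between-class variance criterion can be sketched directly from the histogram quantities just defined (pi = ni/N); the bimodal toy image is illustrative:

```python
import numpy as np

def otsu_threshold(img, L=256):
    """Pick k maximizing the between-class variance
    sigma_B^2(k) = pi0 * pi1 * (mu0 - mu1)^2."""
    n = np.bincount(np.asarray(img).ravel(), minlength=L).astype(float)
    p = n / n.sum()                      # p_i = n_i / N
    best_k, best_var = 0, -1.0
    for k in range(L - 1):
        pi0 = p[:k + 1].sum()            # class 0: levels 0..k
        pi1 = 1.0 - pi0                  # class 1: levels k+1..L-1
        if pi0 == 0 or pi1 == 0:
            continue
        mu0 = (np.arange(k + 1) * p[:k + 1]).sum() / pi0
        mu1 = (np.arange(k + 1, L) * p[k + 1:]).sum() / pi1
        var_b = pi0 * pi1 * (mu0 - mu1) ** 2
        if var_b > best_var:
            best_k, best_var = k, var_b
    return best_k

# Dark object (~20) on a light background (~200)
img = np.array([[20, 22, 200, 201],
                [21, 20, 199, 200],
                [20, 23, 200, 202],
                [22, 21, 201, 200]])
k = otsu_threshold(img)
```

The maximum of the between-class variance lands in the empty gap between the two histogram peaks, exactly where the valley-seeking view of thresholding would put it.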
Entropy-Based Thresholding:
• Entropy-based methods of thresholding derive the threshold
from the entropy of the image's intensity distribution.
• The entropy statistic is high if a variable is well distributed
over the available range, and low if it is well ordered and
narrowly distributed.
• Specifically, entropy is a measure of disorder, and is zero for
a perfectly ordered system.
• The concept of entropy thresholding is to threshold at an
intensity for which the sum of the entropies of the two intensity
probability distributions thereby separated is maximized.
• To proceed, the intensity probability distribution is again
divided into two classes—those with gray levels up to the
threshold value k and those with gray levels above k.
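The entropy criterion above can be sketched as follows (a Kapur-style formulation; the exact variant intended by the notes is assumed):

```python
import numpy as np

def entropy_threshold(img, L=256):
    """Pick k maximizing the sum of the entropies of the two class
    distributions: levels 0..k and levels k+1..L-1."""
    p = np.bincount(np.asarray(img).ravel(), minlength=L).astype(float)
    p /= p.sum()
    best_k, best_h = 0, -np.inf
    for k in range(L - 1):
        P0 = p[:k + 1].sum()
        P1 = 1.0 - P0
        if P0 == 0 or P1 == 0:
            continue
        # class distributions, renormalized to sum to 1
        q0 = p[:k + 1][p[:k + 1] > 0] / P0
        q1 = p[k + 1:][p[k + 1:] > 0] / P1
        h = -(q0 * np.log(q0)).sum() - (q1 * np.log(q1)).sum()
        if h > best_h:
            best_k, best_h = k, h
    return best_k

img = np.array([10, 12, 11, 10, 180, 182, 181, 180])
k = entropy_threshold(img)
```

Splitting exactly between the two clusters keeps both class distributions spread over their own ranges, which is where the summed entropy peaks.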
Maximum Likelihood Thresholding:
• One option is to model the training set data using a known
distribution function such as a Gaussian.
• The Gaussian has many advantages, including its accessibility
to relatively straightforward mathematical analysis.
• It is specifiable in terms of two well-known parameters—the
mean and standard deviation—which are easily measured in
practical situations
• For any Gaussian distribution with mean μ and standard
deviation σ, we have:
➢ p(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²))
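A minimal sketch of the maximum likelihood idea, fitting the two Gaussian parameters to hypothetical training samples for an object and a background class (the sample values are illustrative):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """p(x) = (1/(sigma*sqrt(2*pi))) * exp(-(x - mu)^2 / (2*sigma^2))"""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def fit(samples):
    """Maximum likelihood estimates of mean and standard deviation."""
    mu = sum(samples) / len(samples)
    var = sum((s - mu) ** 2 for s in samples) / len(samples)
    return mu, math.sqrt(var)

dark = [18, 20, 22, 20]        # hypothetical object training samples
light = [198, 200, 202, 200]   # hypothetical background samples

mu_d, sd_d = fit(dark)
mu_l, sd_l = fit(light)

# Classify a pixel intensity by the larger class likelihood
label = 'dark' if gaussian_pdf(60, mu_d, sd_d) > gaussian_pdf(60, mu_l, sd_l) else 'light'
```

Once both classes are modeled this way, the decision boundary between them is the intensity at which the two likelihoods are equal.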
➢ An early study was carried out by Kaizer (1955).
➢ He examined how many pixels an image has to shift before the
autocorrelation function drops to 1/e of its initial value and
produced a subjective measure of coarseness on this basis.
➢ Later work showed that autocorrelation is not a satisfactory
measure of coarseness. In addition, autocorrelation is not a
very good discriminator of isotropy in natural textures.
GRAY-LEVEL CO-OCCURRENCE MATRICES:
• The gray-level co-occurrence matrix approach is based on
studies of the statistics of pixel intensity distributions.
• As hinted above with regard to the variance in pixel
intensity values, single-pixel statistics do not provide rich
enough descriptions of textures for practical applications.
• It is natural, therefore, to consider the second-order
statistics obtained by considering pairs of pixels in certain
spatial relations to each other.
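A minimal sketch of such second-order pair statistics, building a co-occurrence matrix for an illustrative offset of one pixel to the right:

```python
import numpy as np

def glcm(img, dy=0, dx=1, levels=4, symmetric=True):
    """P[i, j] counts how often a pixel of level i has a pixel of
    level j at offset (dy, dx); normalized to a joint distribution."""
    img = np.asarray(img)
    P = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                P[img[y, x], img[y2, x2]] += 1
                if symmetric:               # count the pair both ways
                    P[img[y2, x2], img[y, x]] += 1
    return P / P.sum()

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
P = glcm(img)
# A typical texture feature derived from P: contrast
contrast = sum(P[i, j] * (i - j) ** 2
               for i in range(4) for j in range(4))
```

Features such as contrast, energy, and homogeneity are then read off the matrix; different offsets (dy, dx) probe the texture at different scales and orientations.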
LAWS’ TEXTURE ENERGY APPROACH:
• In 1979 and 1980, Laws presented his novel texture energy
approach to texture analysis.
• This involved the application of simple filters to digital
images.
• The basic filters he used were common Gaussian, edge
detector, and Laplacian-type filters, and were designed to
highlight points of high “texture energy” in the image.
• The Laws’ masks are constructed by convolving together
just three basic 1×3 masks:
• L3 = [1 2 1] (averaging), E3 = [-1 0 1] (edge detection),
S3 = [-1 2 -1] (spot detection)
• These form a complete set. Similarly, the 1×5 masks obtained
by convolving pairs of these 1×3 masks together form a
complete set:
• L5 = [1 4 6 4 1], E5 = [-1 -2 0 2 1], S5 = [-1 0 2 0 -1],
R5 = [1 -4 6 -4 1], W5 = [-1 2 0 -2 1]
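The construction of the 1×5 masks by convolving pairs of 1×3 masks can be checked directly (the W5 sign convention varies between sources):

```python
import numpy as np

# The three basic Laws 1x3 masks
L3 = np.array([1, 2, 1])    # local averaging
E3 = np.array([-1, 0, 1])   # edge detection
S3 = np.array([-1, 2, -1])  # spot detection

# 1x5 masks arise by convolving pairs of the 1x3 masks
L5 = np.convolve(L3, L3)    # smoothing
E5 = np.convolve(L3, E3)    # edge
S5 = np.convolve(L3, S3)    # spot
R5 = np.convolve(S3, S3)    # ripple
W5 = np.convolve(E3, S3)    # wave (sign convention varies)

# 2D texture energy masks are outer products, e.g. the classic E5L5:
E5L5 = np.outer(E5, L5)
```

In use, an image is convolved with each 2D mask and the local absolute (or squared) responses are smoothed to give per-pixel "texture energy" maps.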
ADE’S EIGENFILTER APPROACH:
• In 1983, Ade investigated the theory underlying the Laws’
approach and developed a revised rationale in terms of
eigenfilters.
• He took all possible pairs of pixels within a 3×3 window,
and characterized the image intensity data by a 9×9
covariance matrix.
• He then determined the eigenvectors required to
diagonalize this matrix.
• These correspond to filter masks similar to the Laws’
masks, i.e., use of these “eigenfilter” masks produces
images that are principal component images for the given
texture.
• Each eigenvalue gives the part of the variance of the
original image that can be extracted by the corresponding
filter.
• The variances give an exhaustive description of a given
texture in terms of the texture of the images from which the
covariance matrix was originally derived. Clearly, the
filters that give rise to low variances can be taken to be
relatively unimportant for texture recognition.
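The eigenfilter derivation can be sketched as follows, using a random image as a stand-in texture sample (an assumption: any training texture would do):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((32, 32))           # stand-in texture sample

# Every 3x3 window, flattened into a 9-vector
patches = np.array([img[y:y + 3, x:x + 3].ravel()
                    for y in range(30) for x in range(30)])

cov = np.cov(patches, rowvar=False)          # 9x9 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues ascending

# Each eigenvector reshapes into a 3x3 "eigenfilter" mask
masks = [eigvecs[:, i].reshape(3, 3) for i in range(9)]
# eigvals[i] is the part of the image variance the i-th mask extracts
```

Masks with small eigenvalues extract little variance and can be dropped, which is exactly the pruning argument the notes make for texture recognition.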
4) Explain Mathematical Morphology.
A) MATHEMATICAL MORPHOLOGY:
The basis of mathematical morphology is the application of set
operations to images and to the operators that act on them.
We start by defining a generalized dilation mask as a set of locations
within a 3×3 neighborhood.
When referred to the center of the neighborhood as origin, each of
these locations causes a shift of the image in the direction defined
by the vector from the origin to the location.
When several shifts are prescribed by a mask, the 1 (object) locations
in the various shifted images are combined by a set union operation.
Dilation expands objects into the background and is able to
eliminate “salt” noise within an object.
It can also be used to remove cracks in objects that are less than 3
pixels in width.
Erosion shrinks binary picture objects, and has the effect of
removing “pepper” noise.
It also removes thin object “hairs” whose widths are less than 3
pixels.
The simplest example of this type of mask is the identity operation I,
which leaves the image unchanged.
A mask containing a single off-center location instead produces a shift
of the image, for example a left shift.
Combining these two operations into a single mask yields a dilation
that extends objects in the direction of the shift.
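The shift-and-union description of dilation can be sketched as follows (the full 3×3 mask is an illustrative choice):

```python
import numpy as np

def shift(img, dy, dx):
    """Shift a binary image by (dy, dx), filling with background (0)."""
    out = np.zeros_like(img)
    h, w = img.shape
    ys = slice(max(dy, 0), min(h, h + dy))
    xs = slice(max(dx, 0), min(w, w + dx))
    ys_src = slice(max(-dy, 0), min(h, h - dy))
    xs_src = slice(max(-dx, 0), min(w, w - dx))
    out[ys, xs] = img[ys_src, xs_src]
    return out

def dilate(img, mask_offsets):
    """Union of the image shifted by every vector in the mask."""
    out = np.zeros_like(img)
    for dy, dx in mask_offsets:
        out |= shift(img, dy, dx)
    return out

# Mask = all locations of the 3x3 neighborhood (including the origin)
full_3x3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

img = np.zeros((5, 5), dtype=bool)
img[2, 2] = True
d = dilate(img, full_3x3)   # the single pixel grows to a 3x3 square
```

Erosion is the dual operation: an intersection of images shifted by the opposite vectors, which shrinks objects instead of expanding them.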
1. TEMPLATE MATCHING:
• The complete set of eight templates would be generated
by successive 90° rotations of the first two shown.
• As for edge detection templates, the mask coefficients
are made to sum to zero, so that corner detection is
insensitive to absolute changes in light intensity.
• Ideally, this set of templates should be able to locate all
corners and to estimate their orientation to within 22.5°.
2. SECOND-ORDER DERIVATIVE SCHEMES:
• Second-order differential operator approaches have been
used widely for corner detection, mimicking the first-order
operators used for edge detection.
• Corners in grayscale images occur in regions of rapidly
changing intensity levels.
• By this token, they might appear to be detectable by the
same operators that detect edges in images.
• Corner pixels are much rarer than edge pixels.
• To achieve this sort of discriminability it is clearly
necessary to consider local variations in image intensity
up to at least second order.
3. A MEDIAN FILTER-BASED CORNER DETECTOR:
• An entirely different strategy for detecting corners was
developed by Paler.
• It adopts an initially surprising and rather
nonmathematical approach based on the properties of the
median filter.
• The technique involves applying a median filter to the
input image, and then forming another image that is the
difference between the input and the filtered images.
• This difference image contains a set of signals that are
interpreted as local measures of corner strength.
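The median-filter corner strategy can be sketched as follows (a 3×3 median window is an illustrative choice):

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter with edge padding."""
    h, w = img.shape
    p = np.pad(img, 1, mode='edge')
    out = np.empty_like(img, dtype=float)
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(p[y:y + 3, x:x + 3])
    return out

img = np.zeros((7, 7), dtype=float)
img[3:, 3:] = 10.0          # bright square with a corner at (3, 3)

# Corner strength = |input - median-filtered input|: the median
# flattens a convex corner but preserves straight edges.
strength = np.abs(img - median_filter3(img))
y, x = np.unravel_index(np.argmax(strength), strength.shape)
```

Along a straight edge the median equals the pixel value, so the difference image responds only where the boundary turns, which is what makes this a corner detector rather than another edge detector.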
4. THE HARRIS INTEREST POINT OPERATOR:
• An interest point operator that has become extremely widely
used is the Harris operator.
• The Harris operator only takes account of first-order
derivatives of the intensity function.
• The Harris operator is defined very simply, in terms of
the local components of intensity gradient Ix, Iy in an
image.
• The definition requires a window region to be defined, and
averages ⟨·⟩ are taken over this whole window. We start by
computing the following matrix:
• Δ = [ ⟨Ix²⟩  ⟨IxIy⟩ ; ⟨IxIy⟩  ⟨Iy²⟩ ]
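A sketch of the Harris computation, assuming the conventional response R = det(A) - k*trace(A)^2 with k = 0.04 (the notes stop at the matrix definition):

```python
import numpy as np

def harris_response(img, k=0.04):
    """Build the windowed structure matrix from first-order derivatives
    and return the Harris response for the whole patch as one window."""
    img = np.asarray(img, dtype=float)
    Iy, Ix = np.gradient(img)            # first-order derivatives
    # window averages <Ix^2>, <Iy^2>, <Ix Iy> over the whole patch
    A = np.array([[np.mean(Ix * Ix), np.mean(Ix * Iy)],
                  [np.mean(Ix * Iy), np.mean(Iy * Iy)]])
    return float(np.linalg.det(A) - k * np.trace(A) ** 2)

corner = np.zeros((7, 7)); corner[3:, 3:] = 1.0   # corner: R > 0
edge = np.zeros((7, 7)); edge[:, 3:] = 1.0        # straight edge: R < 0
flat = np.zeros((7, 7))                            # uniform region: R = 0
```

The sign of R separates the three cases: both eigenvalues of A large at a corner, one large at an edge, both near zero in a flat region.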