Digital Image Processing Overview

The document discusses different topics related to digital image processing, including image formation, types of images, color and pixelation, and grayscale images. It explains the process of converting an analog image to digital, including sampling and quantization. It also describes the different types of images: binary, grayscale, and color images.
Rizwan Notes (Digital Image Processing-CS-504-UAF-3(2-1))

 IMAGE FORMATION
 IMAGE COORDINATE SYSTEM
 IMAGE TYPES
 IMAGE CONVERSION
 IMAGE ARITHMETIC FUNCTIONS
Image formation is the analog-to-digital conversion of an image, carried out with 2D sampling and quantization techniques by capture devices such as cameras. In general, we see a 2D view of the 3D world.
Analog image formation takes place in the same way: it is essentially a conversion from the 3D world (the analog scene) to a 2D plane (the digital image).
Generally, a frame grabber or a digitizer is used for sampling and quantizing the analog signals.
Imaging:
The mapping of a 3D world object onto a 2D digital image plane is called imaging. To achieve this, each point on the 3D object must correspond to a point on the image plane. Light reflects from every object that we see, which enables us to capture all those light-reflecting points on the image plane.
Various factors determine the quality of the image, such as spatial resolution or the lens of the capture device.

Color and Pixelation:


In digital imaging, a frame grabber is placed at the image plane, where it acts like a sensor. Light is focused on it, and the continuous image is pixelated from the light reflected by the 3D object. The light that is focused on the sensor generates an electronic signal.
Each pixel that is formed may be colored or grey, depending on the intensity of the sampling and quantization of the reflected light and the electronic signal generated from it.
All these pixels together form a digital image. The density of these pixels determines the image quality: the higher the density, the clearer and higher-resolution the image.
Forming a Digital Image:
In order to form or create a digital image, we need to convert the continuous image data into digital form. This requires two main steps:


 Sampling (2D): Sampling determines the spatial resolution of the digital image, and the sampling rate determines the quality of the digitized image. The magnitude of the sampled image is determined as a value in image processing. Sampling is related to the coordinate values of the image.
 Quantization: Quantization determines the number of grey levels in the digital image. The transition from the continuous values of the image function to their digital equivalents is called quantization. It is related to the intensity values of the image.
 A human observer needs a high number of quantization levels to perceive the fine shading details of an image. More quantization levels result in a clearer image.
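The two steps above can be sketched in code. The following is a minimal, hypothetical Python sketch (plain nested lists stand in for an image, and the continuous scene is modelled as a function of normalized coordinates; names are illustrative, not from any library):

```python
def sample_and_quantize(f, width, height, levels):
    """Sample a continuous image function f(x, y) on a width x height
    grid (sampling), then map each sample onto one of a fixed number of
    grey levels (quantization)."""
    step = 256 // levels                      # size of one quantization bin in 0-255
    image = []
    for row in range(height):
        image.append([])
        for col in range(width):
            value = f(col / width, row / height)          # sampling: evaluate at grid points
            value = max(0.0, min(1.0, value))             # clamp to the valid range
            level = min(int(value * levels), levels - 1)  # quantization: pick a bin
            image[row].append(level * step)               # store as a grey value 0-255
    return image

# A hypothetical continuous scene: a smooth horizontal brightness ramp.
ramp = lambda x, y: x
img = sample_and_quantize(ramp, width=8, height=2, levels=4)
print(img[0])   # only 4 distinct grey values survive quantization
```

Raising `width`/`height` improves spatial resolution (sampling), while raising `levels` improves shading detail (quantization), matching the two roles described above.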

 IMAGE COORDINATE SYSTEM


What image coordinate systems does the Wolfram Language use?
There are two common coordinate systems used in the field of image processing and analysis: the matrix
coordinate system and the graphics coordinate system. They differ in how the locations of pixels are
expressed.
Matrix or index coordinates
The matrix coordinate space reflects the data matrix stored in the image. The matrix is stored in rows of
columns of pixel values. Therefore, pixel locations are the same as their corresponding row and column in the
data matrix, where rows run from top to bottom and columns run from left to right.
Pixels are therefore identified by integer coordinates, row number first. Coordinates {5,6} identify the red pixel
in the image below. The x direction is down, and the y direction is across.


In the Wolfram Language, functions that operate on both images and data arrays adhere to the matrix
coordinate system. Examples include GaussianFilter, ImageConvolve, and ImageData.
Image or graphics coordinates
The origin of the image or graphics coordinate system is the bottom-left corner of an image. The x coordinate
extends from left to right; the y coordinate extends upward.
Integers give the edges of pixels, like grid lines on graph paper.
An image object with dimensions of {width, height} is placed in the 2D space so that in a position {x,y} given in
the standard image coordinate system, x runs continuously from 0 to width and y from 0 to height. The
position {0,0} corresponds to the bottom-left corner. In the smiley image, the pixel that was at exactly {5,6} in
matrix coordinates is now centered at {5.5,11.5} in image coordinates.

Image pixels are covered by intervals between successive integer coordinate values. Thus, noninteger
coordinates are inside a single pixel. Integer coordinates are located on pixel boundaries, however, and take
all immediate pixel neighbors into account, either by selecting all neighboring pixels or by taking their average
color value.
Wolfram Language functions that focus on images adhere to image coordinates. Examples
include ImageValue, ImageCrop, and ImageDimensions.
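The relationship between the two coordinate systems can be written as a small conversion function. This is an illustrative Python sketch, assuming the smiley image from the text is 16 pixels tall (an assumption consistent with {5,6} mapping to {5.5,11.5}):

```python
def matrix_to_image_coords(row, col, height):
    """Convert a 1-indexed matrix coordinate {row, col} (rows run top to
    bottom) to the centre of that pixel in image coordinates, where the
    origin is the bottom-left corner and y extends upward."""
    x = col - 0.5               # columns map directly onto x; 0.5 lands at the pixel centre
    y = height - row + 0.5      # rows count from the top, y counts from the bottom
    return (x, y)

# The pixel at matrix coordinates {5, 6} in a 16-pixel-tall image
# is centred at {5.5, 11.5} in image coordinates, as in the text.
print(matrix_to_image_coords(5, 6, height=16))
```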

Types of Images
There are three types of images. They are as follows:

1. Binary Images
It is the simplest type of image. It takes only two values, i.e. black and white, or 0 and 1. A binary image is a 1-bit image: it takes only 1 binary digit to represent each pixel. Binary images are mostly used for general shape or outline.
For example: Optical Character Recognition (OCR).
Binary images are generated using a threshold operation. Pixels above the threshold value are turned white ('1'), and those below the threshold value are turned black ('0').
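The threshold operation can be sketched in a few lines of Python (nested lists stand in for a greyscale image; the convention that a pixel equal to the threshold becomes white is an assumption):

```python
def to_binary(image, threshold):
    """Threshold a greyscale image (nested lists of 0-255 values):
    pixels at or above the threshold become 1 (white), the rest 0 (black)."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]

grey = [[ 12, 200, 130],
        [255,  60, 128]]
print(to_binary(grey, 128))   # [[0, 1, 1], [1, 0, 1]]
```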


2. Gray-scale images
Grayscale images are monochrome images, meaning they have only one color. Grayscale images do not contain any information about color. Each pixel holds one of the available grey levels.
A normal grayscale image contains 8 bits/pixel, which gives 256 different grey levels. In medical imaging and astronomy, 12 or 16 bits/pixel images are used.

3. Color images
Color images are three-band monochrome images in which each band contains a different color, and the actual information is stored in the digital image. Color images contain gray-level information in each spectral band.
The images are represented as red, green, and blue (RGB) images. Each color image has 24 bits/pixel, meaning 8 bits for each of the three color bands (RGB).

8-bit color format


8-bit color is used for storing image information in a computer's memory or in an image file. In this format, each pixel is represented by one 8-bit byte. It has a 0-255 range of values, in which 0 is used for black, 255 for white, and 127 for gray. The 8-bit color format is also known as a grayscale image. Initially, it was used by the UNIX operating system.

16-bit color format


The 16-bit color format is also known as the high color format. It has 65,536 different color shades. It is used in systems developed by Microsoft. The 16-bit color format is divided into three components, Red, Green, and Blue, also known as the RGB format.
In this RGB format, there are 5 bits for R, 6 bits for G, and 5 bits for B. The one additional bit is given to green because, of the three colors, green is the most soothing to the eyes.

24-bit color format


The 24-bit color format is also known as the true color format. The 24-bit color format is also distributed among Red, Green, and Blue. As 24 bits can be divided equally among the three colors, each gets 8 bits: 8 bits for R, 8 bits for G, and 8 bits for B.

What is gray scale in image processing?


It is an image conversion technique in digital photography. It eliminates every form of color information and leaves only different shades of gray, the brightest being white and the darkest being black. Its intermediate shades usually have an equal level of brightness for the primary colors (red, green, and blue). Alternatively, it uses equal amounts of cyan, yellow, and magenta, which are the primary pigments. Each pixel is a representation of the luminous intensity of the image.

What are the categories of gray scale?


Hue
By randomly altering the color channels of an input image, the system is made to consider alternate color shades for an object. This causes the edges and shapes of objects to be noticed rather than their distinct colors.

Saturation
It involves adjusting the vibrancy of the pixel and that also of the image color.
Brightness
It varies on a scale of 0-100. The lower the level, the darker the shade; the higher the level, the lighter the shade. The shades lie between black and white.

What is image processing?


It is the art of converting an image into a digital format so that it can be manipulated or enhanced for data extraction.

What are the categories of image processing?


Image segmentation
It is the digital technique of dividing or partitioning an image into various parts or regions taking into account
the image's pixels. An example of this is object detection.

Computer Vision
It is a field in artificial intelligence which trains computers in a way to make them understand and draw
meaningful data from digital images and videos. One example of how computer vision works is in road lane
detection.

Image analysis
Image analysis is the processing of an image into components in order to draw useful information from them. This includes counting objects and finding shapes.

Face detection
Face detection is a digital application which identifies facial features in images. An example is seen in the Google Photos app, where photos of the same person are automatically grouped into individual albums.

How is an image converted into gray scale?


The pixel values (ranging from 0-255) of the primary colors red, green, and blue (popularly referred to as RGB) are averaged. The luminous intensity of the three color bands (24 bits in total) is thus combined into a reasonable approximate grayscale value (8 bits).
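The averaging method described above can be sketched as follows (an illustrative Python snippet using nested lists of (r, g, b) tuples; note that weighted luminance formulas are a common alternative, but plain averaging is what the text describes):

```python
def rgb_to_gray(image):
    """Convert a 24-bit RGB image (nested lists of (r, g, b) tuples)
    to an 8-bit greyscale image by averaging the three colour bands."""
    return [[round((r + g + b) / 3) for (r, g, b) in row] for row in image]

rgb = [[(255, 0, 0), (0, 255, 0), (255, 255, 255)]]
print(rgb_to_gray(rgb))   # [[85, 85, 255]]
```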

Why is gray scale needed for image processing?


It helps in simplifying algorithms and eliminates the complexities related to computational requirements.
It makes learning easier for those who are new to image processing, because grayscale compresses an image to its bare minimum pixel information.
It enables easy visualization. It differentiates between the shadow details and the highlights of an image because it is mainly in 2 spatial dimensions (2D) rather than 3D.
Color complexity is also reduced. A typical 3D image requires camera calibration on brightness, among other things.
The grayscale conversion option is very useful for captured images which do not need to match colored detail.

Best use case of gray scale in image processing



Image segmentation and object detection in the medical field.


Grayscale conversion has been used in medical practice for computer-aided diagnosis. It is crucial because images from ultrasound, X-ray, and computed tomography (CT) scans rely heavily on it to give the right advice and treatment.
Image segmentation is vital here, as the various organs and tissues of the human body have different grayscale values. Images captured by medical staff are segmented to differentiate the various anatomical structures. That way, the unique features and defects of each organ or tissue are easily identified.
3D technology creates bounding boxes which make objects easily detected. 2D object detection also highlights areas of specific medical interest.

Pixel Addition

Common Names: Pixel Add, Sum, Offset

Brief Description
In its most straightforward implementation, this operator takes as input two identically sized
images and produces as output a third image of the same size as the first two, in which each pixel
value is the sum of the values of the corresponding pixel from each of the two input images. More
sophisticated versions allow more than two images to be combined with a single operation.
A common variant of the operator simply allows a specified constant to be added to every pixel.

How It Works
The addition of two images is performed straightforwardly in a single pass. The output pixel values are given by:

Q(i,j) = P1(i,j) + P2(i,j)

Or if it is simply desired to add a constant value C to a single image then:

Q(i,j) = P1(i,j) + C

If the pixel values in the input images are actually vectors rather than scalar values (e.g. for color
images) then the individual components (e.g. red, blue and green components) are simply added
separately to produce the output value.
If the image format being used only supports, say 8-bit integer pixel values, then it is very easy for
the result of the addition to be greater than the maximum allowed pixel value. The effect of this
depends upon the particular implementation. The overflowing pixel values might just be set to the
maximum allowed value, an effect known as saturation. Alternatively the pixel values might wrap
around from zero again. If the image format supports pixel values with a much larger
range, e.g. 32-bit integers or floating point numbers, then this problem does not occur so much.
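The two overflow behaviours described above (saturation versus wrap-around) can be sketched like this (an illustrative Python snippet using nested lists as 8-bit greyscale images):

```python
def add_images(p1, p2, overflow="saturate"):
    """Add two identically sized 8-bit greyscale images pixel by pixel.
    Overflowing sums are either clipped at 255 ("saturate") or wrapped
    around modulo 256 ("wrap"), the two behaviours described in the text."""
    out = []
    for row1, row2 in zip(p1, p2):
        if overflow == "saturate":
            out.append([min(a + b, 255) for a, b in zip(row1, row2)])
        else:
            out.append([(a + b) % 256 for a, b in zip(row1, row2)])
    return out

a = [[100, 200]]
b = [[100, 200]]
print(add_images(a, b, "saturate"))   # [[200, 255]]
print(add_images(a, b, "wrap"))       # [[200, 144]]
```

Note how 200 + 200 = 400 saturates to 255 in one mode but wraps to 144 in the other; wide pixel types (32-bit integers, floats) avoid the problem entirely.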
Guidelines for Use

Image addition crops up most commonly as a sub-step in some more complicated process rather than as a useful operator in its own right. As an example we show how


addition can be used to overlay the output from an edge detector on top of the original image
after suitable masking has been carried out.
The image

shows a simple flat dark object against a light background. Applying the Canny edge detector to
this image, we obtain

Suppose that our task is to overlay this edge data on top of the original image. The image

is the result of straightforwardly adding the two images. Since the sum of the edge pixels and the
underlying values in the original is greater than the maximum possible pixel value, these pixels are
(in this implementation) wrapped around. Therefore these pixels have a rather low pixel value and
it is hard to distinguish them from the surrounding pixels. In order to avoid the pixel overflow we
need to replace pixels in the original image with the corresponding edge data pixels, at every
place where the edge data pixels are non-zero. The way to do this is to mask off a region of the
original image before we do any addition.
The mask is made by thresholding the edge data at a pixel value of 128 in order to produce

This mask is then inverted and subsequently ANDed with the original image to produce

Finally, the masked image is added to the unthresholded edge data to produce


This image now clearly shows that the Canny edge detector has done an extremely good job of
localizing the edges of the original object accurately. It also shows how the response of the edge
detector drops off at the fuzzier left hand edge of the object.
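The mask-then-add procedure just described can be condensed into a per-pixel sketch (illustrative Python; the threshold of 128 follows the text, and the inverted mask and AND are folded into one conditional):

```python
def overlay_edges(original, edges, threshold=128):
    """Overlay edge-detector output on the original image without pixel
    overflow: mask off the original wherever the edge data is strong,
    then add the (unthresholded) edge data back in."""
    out = []
    for orow, erow in zip(original, edges):
        out_row = []
        for o, e in zip(orow, erow):
            masked = 0 if e >= threshold else o   # inverted mask ANDed with the original
            out_row.append(min(masked + e, 255))  # add edge data; strong edges cannot overflow
        out.append(out_row)
    return out

original = [[200, 200, 200]]
edges    = [[  0, 255,  10]]
print(overlay_edges(original, edges))   # [[200, 255, 210]]
```

Where the edge response is strong (255), the original pixel is masked to zero first, so the sum stays in range instead of wrapping.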
Other uses of addition include adding a constant offset to all pixels in an image so as to brighten
that image. For example, adding a constant value of 50 to

Yields

It is important to realize that if the input images are already quite bright, then straight addition
may produce a pixel value overflow. Image

shows the results of adding 100 to the above image. Most of the background pixels are greater
than the possible maximum (255) and therefore are (with this implementation of
addition) wrapped around from zero. If we implement the operator in such a way that pixel values
exceeding the maximum value are set to 255 (i.e. using a hard limit) we obtain

This image looks more natural than the wrapped around one. However, due to the saturation, we
lose a certain amount of information, since all the values exceeding the maximum value are set to
the same graylevel.
In this case, the pixel values should be scaled down before addition. The image


is the result of scaling the original with 0.8 and adding a constant value of 100. Although the
image is brighter than the original, it has lost contrast due to the scaling. In most cases, scaling the
image with a factor larger than 1 without using addition at all provides a better way to brighten an
image, as it increases the image contrast. For comparison,

is the original image multiplied with 1.3.


Blending provides a slightly more sophisticated way of merging two images which ensures that
saturation cannot happen.
When adding color images it is important to consider how the color information has been
encoded. The section on 8-bit color images describes the issues to be aware of when adding such
images.


Exercises
1. Add the above Canny edge image to its original, using different implementations of pixel
addition which handle pixel overflow in different ways. Which one yields the best
results for this application?
2. Use skeletonization to produce a skeleton of

Add the skeleton to the original. Which problems do you face and how might they be
solved?
3. Add a constant value of 255 to

Use two different implementations, one wrapping around from zero all pixel values
exceeding the maximum value and one using a hard limit of 255. Comment on the results.

Pixel Subtraction

Common Names: Pixel difference, Pixel subtract



The pixel subtraction operator takes two images as input and produces as output a third image
whose pixel values are simply those of the first image minus the corresponding pixel values from
the second image. It is also often possible to just use a single image as input and subtract a
constant value from all the pixels. Some versions of the operator will just output the absolute
difference between pixel values, rather than the straightforward signed output.

How It Works
The subtraction of two images is performed straightforwardly in a single pass. The output pixel values are given by:

Q(i,j) = P1(i,j) - P2(i,j)

Or if the operator computes absolute differences between the two input images then:

Q(i,j) = |P1(i,j) - P2(i,j)|

Or if it is simply desired to subtract a constant value C from a single image then:

Q(i,j) = P1(i,j) - C
If the pixel values in the input images are actually vectors rather than scalar values (e.g. for color
images) then the individual components (e.g. red, blue and green components) are simply
subtracted separately to produce the output value.
Implementations of the operator vary as to what they do if the output pixel values are negative.
Some work with image formats that support negatively-valued pixels, in which case the negative
values are fine (and the way in which they are displayed will be determined by the
display colormap). If the image format does not support negative numbers then often such pixels
are just set to zero (i.e. black typically). Alternatively, the operator may `wrap' negative values, so
that for instance -30 appears in the output as 226 (assuming 8-bit pixel values).
If the operator calculates absolute differences and the two input images use the same pixel value
type, then it is impossible for the output pixel values to be outside the range that may be
represented by the input pixel type and so this problem does not arise. This is one good reason for
using absolute differences.
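The three ways of handling negative results described above (clip to zero, wrap around, absolute difference) can be sketched as follows (illustrative Python, nested lists as 8-bit images):

```python
def subtract_images(p1, p2, mode="clip"):
    """Subtract two 8-bit greyscale images pixel by pixel. Negative
    results are clipped to zero ("clip"), wrapped modulo 256 ("wrap"),
    or replaced by absolute differences ("abs")."""
    handlers = {
        "clip": lambda a, b: max(a - b, 0),
        "wrap": lambda a, b: (a - b) % 256,
        "abs":  lambda a, b: abs(a - b),
    }
    op = handlers[mode]
    return [[op(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(p1, p2)]

a = [[100, 50]]
b = [[ 80, 80]]
print(subtract_images(a, b, "clip"))   # [[20, 0]]
print(subtract_images(a, b, "wrap"))   # [[20, 226]]
print(subtract_images(a, b, "abs"))    # [[20, 30]]
```

Note how -30 wraps to 226, exactly the example given in the text, while absolute differencing keeps the result in range by construction.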

Guidelines for Use


Image subtraction is used both as a sub-step in complicated image processing sequences, and also
as an important operator in its own right.
A common use is to subtract background variations in illumination from a scene so that the
foreground objects in it may be more easily analyzed. For instance,


shows some text which has been badly illuminated during capture so that there is a strong
illumination gradient across the image. If we wish to separate out the foreground text from the
background page, then the obvious method for black on white text is simply to threshold the
image on the basis of intensity. However, simple thresholding fails here due to the illumination
gradient. A typical failed attempt looks like

Now it may be that we cannot adjust the illumination, but we can put different things in the
scene. This is often the case with microscope imaging, for instance. So we replace the text with a
sheet of white paper and without changing anything else we capture a new image, as shown in

This image is the lightfield. Now we can subtract the lightfield image from the original image to attempt to eliminate the variation in the background intensity. Before doing that, an offset of 100 is added to the first image in order to avoid getting negative numbers, and we also use 32-bit integer pixel values to avoid overflow problems. The result of the subtraction is shown in

Note that the background intensity of the image is much more uniform than before, although the
contrast in the lower part of the image is still poor. Straightforward thresholding can now achieve
better results than before, as shown in

which is the result of thresholding at a pixel value of 80. Note that the results are still not ideal,
since in the poorly lit areas of the image the contrast (i.e. difference between foreground and
background intensity) is much lower than in the brightly lit areas, making a suitable threshold


difficult or impossible to find. Compare these results with the example described under pixel
division.
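The lightfield-subtraction procedure can be sketched numerically (illustrative Python; the 1x3 "strip" and its values are hypothetical, with plain Python ints playing the role of a wide pixel type so no overflow handling is needed):

```python
def remove_lightfield(text_img, lightfield, offset=100):
    """Flatten uneven illumination: add an offset to the badly lit image,
    then subtract the lightfield captured under the same illumination."""
    return [[(t + offset) - l for t, l in zip(trow, lrow)]
            for trow, lrow in zip(text_img, lightfield)]

# Hypothetical strip: dark ink at the left, paper lit by a strong gradient.
text       = [[ 10, 120, 210]]   # ink pixel, then paper getting brighter rightward
lightfield = [[ 60, 140, 230]]   # white paper under the same gradient
print(remove_lightfield(text, lightfield))   # [[50, 80, 80]]
```

After subtraction the background pixels become uniform (80), while the ink pixel (50) stays darker, so a single global threshold now separates foreground from background.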
Absolute image differencing is also used for change detection. If the absolute difference between
two frames of a sequence of images is formed, and there is nothing moving in the scene, then the
output will mostly consist of zero-value pixels. If, however, there is movement going on, then pixels in regions of the image where the intensity changes spatially will exhibit significant absolute differences between the two frames.
As an example of such change detection, consider

which shows an image of a collection of screws and bolts. The image

shows a similar scene with one or two differences. If we calculate the absolute difference
between the frames as shown in

then the regions that have changed become clear. The last image here has been contrast-
stretched in order to improve clarity.
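Change detection by absolute differencing can be sketched as follows (illustrative Python; the small tolerance parameter is an assumption added to absorb sensor noise, which the text notes can otherwise show up as spurious differences):

```python
def changed_regions(frame1, frame2, tolerance=10):
    """Change detection: mark pixels whose absolute difference between
    two frames exceeds a small tolerance (the tolerance absorbs noise)."""
    return [[1 if abs(a - b) > tolerance else 0
             for a, b in zip(r1, r2)] for r1, r2 in zip(frame1, frame2)]

f1 = [[50, 50, 200]]
f2 = [[52, 50, 90]]    # rightmost pixel: an object moved away
print(changed_regions(f1, f2))   # [[0, 0, 1]]
```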
Subtraction can also be used to estimate the temporal derivative of intensity at each point in a
sequence of images. Such information can be used, for instance, in optical flow calculations.
Simple subtraction of a constant from an image can be used to darken an image,
although scaling is normally a better way of doing this.
It is important to think about whether negative output pixel values can occur as a result of the
subtraction, and how the software will treat pixels that do have negative values. An example of
what may happen can be seen in

which is the above lightfield directly subtracted from the text images. In the implementation
of pixel subtraction which was used, negative values are wrapped around starting from the
maximum value. Since we don't have exactly the same reflectance of the paper when taking the
images of the lightfield and the text, the difference of pixels belonging to background is either

slightly above or slightly below zero. Therefore the wrapping results in background pixels with
either very small or very high values, thus making the image unsuitable for further processing (for
example, thresholding). If we alternatively set all negative values to zero, the image would
become completely black, because subtracting the pixels in the lightfield from the pixels
representing characters in the text image yields negative results, as well.
In this application, a suitable way to deal with negative values is to use absolute differences, as
can be seen in

or as a gamma corrected version in

Thresholding this image yields similar good results as the earlier example.
If negative values are to be avoided then it may be possible to first add an offset to the first input
image. It is also often useful if possible to convert the pixel value type to something with a
sufficiently large range to avoid overflow, e.g. 32-bit integers or floating point numbers.


Exercises
1. Take images of your watch at two different times, without moving it in between, and use
subtraction to highlight the difference in the display.
2. Use

to investigate the following method for edge detection. First apply erosion to the image and then subtract the result from the original. What is the difference in the edge image if you use dilation instead of erosion? What effect do the size and form of the structuring element have on the result? How does the technique perform on grayscale images?


Pixel Multiplication and Scaling

Common Names: Pixel Multiplication, Graylevel scale

Brief Description
Like other image arithmetic operators, multiplication comes in two main forms. The first form
takes two input images and produces an output image in which the pixel values are just those of
the first image, multiplied by the corresponding pixel values in the second image. The
second form takes a single input image and produces output in which each pixel value is
multiplied by a specified constant. This latter form is probably the more widely used and is
generally called scaling.
This graylevel scaling should not be confused with geometric scaling.

How It Works
The multiplication of two images is performed in the obvious way in a single pass using the formula:

Q(i,j) = P1(i,j) × P2(i,j)

Scaling by a constant is performed using:

Q(i,j) = P1(i,j) × C
Note that the constant is often a floating point number, and may be less than one, which will
reduce the image intensities. It may even be negative if the image format supports that.
If the pixel values are actually vectors rather than scalar values (e.g. for color images) then the individual components (e.g. red, blue and green components) are simply multiplied separately to produce the output value.
If the output values are calculated to be larger than the maximum allowed pixel value, then they
may either be truncated at that maximum value, or they can `wrap around' and continue upwards
from the minimum allowed number again.
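Scaling with the two overflow behaviours can be sketched as follows (illustrative Python; the values reproduce the observation, made later in the text, that with a factor of 5 any pixel above 51 overflows an 8-bit range):

```python
def scale_image(image, factor, overflow="saturate"):
    """Scale every pixel of an 8-bit greyscale image by a constant
    factor, clipping overflow at 255 or wrapping it modulo 256."""
    out = []
    for row in image:
        scaled = [int(p * factor) for p in row]
        if overflow == "saturate":
            out.append([min(p, 255) for p in scaled])
        else:
            out.append([p % 256 for p in scaled])
    return out

dark = [[30, 51, 60]]
print(scale_image(dark, 5, "saturate"))   # [[150, 255, 255]]
print(scale_image(dark, 5, "wrap"))       # [[150, 255, 44]]
```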

Guidelines for Use


There are many specialist uses for scaling. In general though, given a scaling factor greater than
one, scaling will brighten an image. Given a factor less than one, it will darken the image. Scaling

generally produces a much more natural brightening/darkening effect than simply adding an
offset to the pixels, since it preserves the relative contrast of the image better. For instance,

shows a picture of a model robot that was taken under low lighting conditions. Simply scaling every
pixel by a factor of 3, we obtain

which is much clearer. However, when using pixel multiplication, we should make sure that the calculated pixel values do not exceed the maximum possible value. If we, for example, scale the above image by a factor of 5 using an 8-bit representation, we obtain

All the pixels which, in the original image, have a value greater than 51 exceed the maximum
value and are (in this implementation) wrapped around from 255 back to 0.
The last example shows that it is important to be aware of what will happen if the multiplications
result in pixel values outside the range that can be represented by the image format being used. It
is also very easy to generate very large numbers with pixel-by-pixel multiplication. If the image
processing software supports it, it is often safest to change to an image format with a large
range, e.g. floating point, before attempting this sort of calculation.
Scaling is also often useful prior to other image arithmetic in order to prevent pixel values going
out of range, or to prevent integer quantization ruining the results (as in integer image division).
Pixel-by-pixel multiplication is generally less useful, although sometimes a binary image can be
used to multiply another image in order to act as a mask. The idea is to multiply by 1 those pixels
that are to be preserved, and multiply by zero those that are not. However for integer format
images it is often easier and faster to use the logical operator AND instead.
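The masking use of pixel-by-pixel multiplication can be sketched in a few lines (illustrative Python; for 0/1 masks on integer images, this is equivalent to the logical AND mentioned above):

```python
def mask_multiply(image, mask):
    """Use a binary image as a mask: multiplying a pixel by 1 preserves
    it, multiplying by 0 removes it."""
    return [[p * m for p, m in zip(prow, mrow)]
            for prow, mrow in zip(image, mask)]

image = [[90, 120, 200]]
mask  = [[ 1,   0,   1]]
print(mask_multiply(image, mask))   # [[90, 0, 200]]
```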
Pixel Division

Common Names: Pixel Division, Ratioing


Brief Description
The image division operator normally takes two images as input and produces a third whose pixel
values are just the pixel values of the first image divided by the corresponding pixel values of the
second image. Many implementations can also be used with just a single input image, in which
case every pixel value in that image is divided by a specified constant.

How It Works
The division of two images is performed in the obvious way in a single pass using the formula:

Q(i,j) = P1(i,j) / P2(i,j)

Division by a constant is performed using:

Q(i,j) = P1(i,j) / C
If the pixel values are actually vectors rather than scalar values (e.g. for color images) then the individual components (e.g. red, blue and green components) are simply divided separately to produce the output value.
The division operator may only implement integer division, or it may also be able to handle
floating point division. If only integer division is performed, then results are typically rounded
down to the next lowest integer for output. The ability to use images with pixel value types other
than simply 8-bit integers comes in very handy when doing division.
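The integer-versus-floating-point distinction can be sketched as follows (illustrative Python; the guard against dividing by zero-valued pixels is an assumption, as the text does not specify how implementations handle it):

```python
def divide_images(p1, p2, integer=False):
    """Divide two images pixel by pixel (ratioing). With integer=True the
    result is rounded down, as in an integer-only implementation;
    otherwise floating-point ratios are returned."""
    op = (lambda a, b: a // b) if integer else (lambda a, b: a / b)
    return [[op(a, max(b, 1)) for a, b in zip(r1, r2)]   # guard against division by zero
            for r1, r2 in zip(p1, p2)]

a = [[100, 100]]
b = [[200,  40]]
print(divide_images(a, b))                 # [[0.5, 2.5]]
print(divide_images(a, b, integer=True))   # [[0, 2]]
```

Note how integer division collapses 0.5 to 0 and 2.5 to 2, which is the quantization problem the text warns about.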

Guidelines for Use


One of the most important uses of division is in change detection, in a similar way to the use
of subtraction for the same thing. Instead of giving the absolute change for each pixel from one
frame to the next, however, division gives the fractional change or ratio between corresponding
pixel values (hence the common alternative name of ratioing). The images

and

are of the same scene except two objects have been slightly moved between the exposures.
Dividing the former by the latter using a floating point pixel type and then contrast stretching the
resulting image yields


After the division, pixels which didn't change between the exposures have a value of 1. If a pixel
value increased between the exposures, the ratio lies between 0 and 1; if it decreased, the ratio lies
between 1 and 255 (provided the pixel value in the second image is not smaller than 1). That is
why we can only see the new position of the moved part in the contrast-stretched image. The old
position can be visualized by histogram equalizing the division output, as shown in

Here, high values correspond to the new position, low values correspond to the old position,
assuming that the intensity of the moved object is lower than the background intensity.
Intermediate graylevels in the equalized image correspond to areas of no change. Due to noise,
the image also shows the position of objects which were not moved.
For comparison, the absolute difference between the two images, as shown in

produces approximately the same pixel values at the old and the new position of a moved part.
Blending

Common Names: Blend, Merge

Brief Description
This operator forms a blend of two input images of the same size. Similar to pixel addition, the
value of each pixel in the output image is a linear combination of the corresponding pixel values in
the input images. The coefficients of the linear combination are user-specified and they define the


ratio by which to scale each image before combining them. These proportions are applied such
that the output pixel values do not exceed the maximum pixel value.

How It Works
The resulting image is calculated using the formula

Q(i,j) = X(i,j)*P1(i,j) + (1 - X(i,j))*P2(i,j)

where P1 and P2 are the two input images. In some applications P2 can also be a constant, thus
allowing a constant offset value to be added to a single image.
X is the blending ratio, which determines the influence of each input image on the output. X can
either be a constant factor for all pixels in the image or can be determined for each pixel
separately using a mask. The size of the mask must then be identical to the size of the images.
Some implementations only support graylevel images. If multi-spectral images are supported the
calculation is done for each band separately.
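
As a minimal sketch of the formula above (assuming grayscale images as nested lists; the function name is illustrative), a blend that accepts either a constant ratio or a per-pixel mask might look like:

```python
def blend(img1, img2, x):
    """Blend two grayscale images: Q = X*P1 + (1-X)*P2.

    x may be a constant blending ratio in [0, 1] or a per-pixel
    mask (a nested list with the same size as the images).
    """
    def ratio(i, j):
        return x[i][j] if isinstance(x, list) else x
    return [[ratio(i, j) * img1[i][j] + (1 - ratio(i, j)) * img2[i][j]
             for j in range(len(img1[0]))]
            for i in range(len(img1))]

# With X = 0.5 each input contributes half of its contrast:
print(blend([[200, 100]], [[0, 100]], 0.5))  # [[100.0, 100.0]]
```

Because the coefficients sum to 1 for every pixel, the output can never exceed the larger of the two input values, which is what makes blending overflow-safe compared to plain addition.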

Guidelines for Use


Image blending is used for similar applications as image addition with the difference that we don't
have to worry whether the values of the output image exceed the allowed maximum. In most
cases the operator is a part of some more complicated process. As an example we use image
blending to overlay the output of an edge detector on top of the original image, (compare with
the results achieved with image addition).
The image

shows a simple flat dark object against a light background. Applying the Canny edge detector to
this image we obtain

We get

if we apply the blending operator with X = 0.5, where the original image is P1 and the edge image
is P2. The result clearly shows the disadvantage of image blending over image addition: since each
of the input images is scaled with 0.5 before they are added up, the contrast of each image is

halved. That is why it is hard to see the difference between the object and the background of the
original image. If the contrast in one image is more important than the other, we can improve the
result by choosing a blending ratio other than 0.5, thus keeping more of the contrast in the image
where it is needed. To get

the same two images as above were blended with X=0.7.


The bad result in the first example is mainly due to the low initial contrast in the input images. So,
we will have a better result if the input images are of high contrast. To produce

the input images were contrast-enhanced with contrast stretching and then blended with X = 0.5.
Although this already yields a better result, we still lose some contrast with respect to the original
input images.
To maintain the full contrast in the output image we can define a special mask. The mask is made
by thresholding the edge image at a pixel value of 128 and setting the non-zero values to one.
Now, we blend the graylevel edge image (now corresponding to P1) and the original image using
the thresholded image as a blending mask X(i,j). The image

shows the result, which is identical to the one achieved with image addition, but now achieved via
a slightly simpler process.
Blending can also be used to achieve nice effects in photographs. We obtained

by blending


with the resized version of

using X = 0.5.


Exercises
1. Examine the effects of using blending ratios other than 0.5 when blending

and

2. Take an image and add a constant value (e.g. 100) using image blending and image
addition. Comment on the differences of the results.
3. Produce a skeleton from

using skeletonization. Assess the result by combining the two images using the blending
operator.
Logical AND/NAND

Common Names: AND, NAND


Brief Description
AND and NAND are examples of logical operators having the truth-tables shown in Figure 1.

Figure 1 Truth-tables for AND and NAND.

As can be seen, the output values of NAND are simply the inverse of the corresponding output
values of AND.
The AND (and similarly the NAND) operator typically takes two binary or integer graylevel
images as input, and outputs a third image whose pixel values are just those of the first image,
ANDed with the corresponding pixels from the second. A variation of this operator takes just a
single input image and ANDs each pixel with a specified constant value in order to produce the
output.

How It Works
The operation is performed straightforwardly in a single pass. It is important that all the input
pixel values being operated on have the same number of bits in them or unexpected things may
happen. Where the pixel values in the input images are not simple 1-bit numbers, the AND
operation is normally (but not always) carried out individually on each corresponding bit in the
pixel values, in bitwise fashion.
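
The bitwise behavior described above can be sketched as follows (an illustrative Python fragment assuming nested-list grayscale images). Note how two high graylevels can still AND to a low value, which is the pitfall discussed later in this section:

```python
def and_images(img1, img2):
    """Bitwise AND of two graylevel images of equal size."""
    return [[p1 & p2 for p1, p2 in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

# Two high graylevels can AND to a low value:
print(128 & 127)                               # 0
print(and_images([[255, 200]], [[255, 150]]))  # [[255, 128]]
```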

Guidelines for Use


The most obvious application of AND is to compute the intersection of two images. We illustrate
this with an example where we want to detect those objects in a scene which did not move
between two images, i.e. which are at the same pixel positions in the first and the second image.
We illustrate this example using


and

If we simply AND the two graylevel images in a bitwise fashion we obtain

Although we wanted the moved object to disappear from the resulting image, it appears twice, at
its old and at its new position. The reason is that the object has rather low pixel values (similar to
a logical 0) whereas the background has high values (similar to a logical 1). However, we
normally associate an object with logical 1 and the background with logical 0; therefore we
actually ANDed the negatives of the two images, which is equivalent to NORing them. To obtain
the desired result we have to invert the images before ANDing them, as was done in

Now, only the object which has the same position in both images is highlighted. However, ANDing
two graylevel images might still cause problems, as it is not guaranteed that ANDing two high
pixel values in a bitwise fashion yields a high output value (for example, 128 AND 127 yields 0). To
avoid these problems, it is best to produce binary versions of the grayscale images
using thresholding.


and

are the thresholded versions of the above images and

is the result of ANDing their negatives.


Although ANDing worked well for the above example, it runs into problems in a scene like

Here, we have two objects with the average intensity of one being higher than the background
and the other being lower. Hence, we can't produce a binary image containing both objects using
simple thresholding. As can be seen in the following images, ANDing the grayscale images is not
successful either. If in the second scene the light part was moved, as in

then the result of ANDing the two images is

It shows the desired effect of attenuating the moved object. However, if the second scene is
something like


where the dark object was moved, we obtain

Here, the old and the new positions of the dark object are visible.
In general, applying the AND operator (or other logical operators) to two images in order to detect
differences or similarities between them is most appropriate if they are binary or can be
converted into binary format using thresholding.
As with other logical operators, AND and NAND are often used as sub-components of more
complex image processing tasks. One of the common uses for AND is for masking. For example,
suppose we wish to selectively brighten a small region of

to highlight a particular car. There are many ways of doing this and we illustrate just one. First
a paint program is used to identify the region to be highlighted. In this case we set the region to
black as shown in

This image can then be thresholded to just select the black region, producing the mask shown in


The mask image has a pixel value of 255 (11111111 binary) in the region that we are interested in,
and zero pixels (00000000 binary) elsewhere. This mask is then bitwise ANDed with the original
image to just select out the region that will be highlighted. This produces

Finally, we brighten this image by scaling it by a factor of 1.1, dim the original image using a scale
factor of 0.8, and then add the two images together to produce

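The whole masking workflow just described can be sketched in one short function. This is only an illustration of the described steps (the function name, the 0/255 mask convention and the 1.1/0.8 scale factors follow the text, not any particular library):

```python
def highlight(img, mask, up=1.1, down=0.8):
    """Select a region with a bitwise AND against a 0/255 mask,
    brighten it, dim the whole original image, then add the two,
    clipping results to the 8-bit range [0, 255]."""
    region = [[min(255, int((p & m) * up)) for p, m in zip(rp, rm)]
              for rp, rm in zip(img, mask)]
    dimmed = [[int(p * down) for p in rp] for rp in img]
    return [[min(255, r + d) for r, d in zip(rr, rd)]
            for rr, rd in zip(region, dimmed)]

# The masked pixel ends up noticeably brighter than the rest:
print(highlight([[100, 100]], [[255, 0]]))  # [[190, 80]]
```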
AND can also be used to perform so called bit-slicing on an 8-bit image. To determine the
influence of one particular bit on an image, it is ANDed in a bitwise fashion with a constant
number, where the relevant bit is set to 1 and the remaining 7 bits are set to 0. For example, to
obtain the bit-plane 8 (corresponding to the most significant bit) of

we AND the image with 128 (10000000 binary) and threshold the output at a pixel value of 1. The
result, shown in


is equivalent to thresholding the image at a value of 128. Images

and

correspond to bit-planes 7, 6 and 4. The images show that most image information is contained in
the higher (more significant) bits, whereas the less significant bits contain some of the finer details
and noise. The image

shows bit-plane 1.
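
Bit-slicing as described above amounts to ANDing with a one-bit constant and thresholding the result. A small illustrative sketch (nested-list images, plane numbered 1 to 8 from least to most significant):

```python
def bit_plane(img, plane):
    """Extract bit-plane `plane` (1 = least significant,
    8 = most significant) of an 8-bit image: AND each pixel with a
    one-bit constant, then threshold the result at 1."""
    bitmask = 1 << (plane - 1)
    return [[255 if (p & bitmask) else 0 for p in row] for row in img]

# The most significant plane is equivalent to thresholding at 128:
print(bit_plane([[130, 127]], 8))  # [[255, 0]]
```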


Exercises
1. NAND


and

Compare the result with the result of ANDing the negatives of the two input images.
2. AND

and

as well as the negatives of

and

Compare the results with the ones obtained in the previous section.


3. Extract all 8 bit planes from

and

Comment on the number of visually significant bits in each image.


4. What would be the effect of ANDing an 8-bit graylevel image with a constant value of 240
(11110000 in binary)? Why might you want to do this?
5. What would be the effect of ANDing an 8-bit graylevel image with a constant value of 15
(00001111 in binary)? Why might you want to do this? Try this out on

and comment on what you see.

Logical OR/NOR

Common Names: OR, NOR

Brief Description
OR and NOR are examples of logical operators having the truth-tables shown in Figure 1.


Figure 1 Truth-tables for OR and NOR.

As can be seen, the output values of NOR are simply the inverses of the corresponding output
values of OR.
The OR (and similarly the NOR) operator typically takes two binary or graylevel images as input,
and outputs a third image whose pixel values are just those of the first image, ORed with the
corresponding pixels from the second. A variation of this operator takes just a single input image
and ORs each pixel with a specified constant value in order to produce the output.

How It Works
The operation is performed straightforwardly in a single pass. It is important that all the input
pixel values being operated on have the same number of bits in them or unexpected things may
happen. Where the pixel values in the input images are not simple 1-bit numbers, the OR
operation is normally (but not always) carried out individually on each corresponding bit in the
pixel values, in bitwise fashion.

Guidelines for Use


We can illustrate the function of the OR operator using

and


The images show a scene with two objects, one of which was moved between the exposures. We
can use OR to compute the union of the images, i.e. highlighting all pixels which represent an
object either in the first or in the second image. First, we threshold the images, since the process
is simplified by using binary input. If we OR the resulting images

and

we obtain

This image shows only the position of the object which was at the same location in both input
images. The reason is that the objects are represented with logical 0 and the background with
logical 1. Hence, we actually ORed the backgrounds, which is equivalent to NANDing the objects.
To get the desired result, we first have to invert the input images before ORing them. Then, we
obtain

Now, the output shows the position of the stationary object as well as that of the moved object.


As with other logical operators, OR and NOR are often used as sub-components of more complex
image processing tasks. OR is often used to merge two images together. Suppose we want to
overlay

with its histogram, shown in

First, an image editor is used to enlarge the histogram image until it is the same size as the
grayscale image as shown in

Then, simply ORing the two gives

The performance in this example is quite good, because the images contain very distinct
graylevels. If we proceed in the same way with

we obtain

Now, it is difficult to see the characters of the histogram (which have high pixel values) at places
where the original image has high values, as well. Compare the result with that described
under XOR.

Note that there is no problem of overflowing pixel values with the OR operator, as there is with
the addition operator.
ORing is usually safest when at least one of the images is binary, i.e. the pixel values are 0000...
and 1111... only. The problem with ORing other combinations of integers is that the output result
can fluctuate wildly with a small change in input values. For instance 127 ORed with 128 gives 255,
whereas 127 ORed with 126 gives 127.
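
This instability is easy to demonstrate in a short sketch (illustrative Python on nested-list images): a change of 2 in one input swings the output from 255 down to 127.

```python
def or_images(img1, img2):
    """Bitwise OR of two graylevel images of equal size."""
    return [[p1 | p2 for p1, p2 in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

# A tiny change in one input value can swing the output wildly:
print(127 | 128)  # 255
print(127 | 126)  # 127

# With a binary (0/255) input the result is well behaved:
print(or_images([[0, 255]], [[255, 0]]))  # [[255, 255]]
```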


Exercises
1. NOR

and

and AND their negatives. Compare the results.


2. Why can't you use thresholding to produce a binary image containing both objects of

and


? Use graylevel ORing to combine the two images. Can you detect all the locations of the
objects in the two images? What changes if you invert the images before combining them?
3. In the example above, how could you make the histogram appear in black instead of white?
Try it.
4. Summarize the conditions under which you would use OR to combine two images rather
than, say, addition or blending.

Logical XOR/XNOR

Common Names: XOR, XNOR, EOR, ENOR

Brief Description
XOR and XNOR are examples of logical operators having the truth-tables shown in Figure 1.

Figure 1 Truth-tables for XOR and XNOR.

The XOR function is only true if just one (and only one) of the input values is true, and false
otherwise. XOR stands for eXclusive OR. As can be seen, the output values of XNOR are simply the
inverse of the corresponding output values of XOR.
The XOR (and similarly the XNOR) operator typically takes two binary or graylevel images as input,
and outputs a third image whose pixel values are just those of the first image, XORed with the


corresponding pixels from the second. A variation of this operator takes a single input image and
XORs each pixel with a specified constant value in order to produce the output.

How It Works
The operation is performed straightforwardly in a single pass. It is important that all the input
pixel values being operated on have the same number of bits in them, or unexpected things may
happen. Where the pixel values in the input images are not simple 1-bit numbers, the XOR
operation is normally (but not always) carried out individually on each corresponding bit in the
pixel values, in bitwise fashion.

Guidelines for Use


We illustrate the function of XOR using

and

Since logical operators work more reliably with binary input we first threshold the two images,
thus obtaining

and


Now, we can use XOR to detect changes in the images, since pixels which didn't change output 0
and pixels which did change result in 1. The image

shows the result of XORing the thresholded images. We can see the old and the new position of
the moved object, whereas the stationary object almost disappeared from the image. Due to the
effects of noise, we can still see some pixels around the boundary of the stationary
object, i.e. pixels whose values in the original image were close to the threshold.
In a scene like

it is not possible to apply a threshold in order to obtain a binary image, since one of the objects is
lighter than the background whereas the other one is darker. However, we can combine two
grayscale images by XORing them in a bitwise fashion.

shows a scene where the dark object was moved and in

the light object changed its position. XORing each of them with the initial image yields

and


respectively. In both cases, the moved part appears at the old as well as at the new location and
the stationary object almost disappears. This technique is based on the assumption that XORing
two similar grayvalues produces a low output, whereas two distinct inputs yield a high output.
However, this is not always true, e.g. XORing 127 and 128 yields 255. These effects can be seen at
the boundary of the stationary object, where the pixels have an intermediate graylevel and might,
due to noise, differ slightly between two of the images. Hence, we can see a line with high values
around the stationary object. A similar problem is that the output for the moved pen is much
higher than the output for the moved piece of paper, although the contrast between their
intensities and that of the background value is roughly the same. Because of these problems it is
often better to use image subtraction or image division for change detection.
As with other logical operators, XOR and XNOR are often used as sub-components of more
complex image processing tasks. XOR has the interesting property that if we XOR A with B to
get Q, then the bits of Q are the same as A where the corresponding bit from B is zero, but they
are of the opposite value where the corresponding bit from B is one. So for instance using binary
notation, 1010 XORed with 1100 gives 0110. For this reason, B could be thought of as a bit-
reversal mask. Since the operator is symmetric, we could just as well have treated A as the mask
and B as the original.
Extending this idea to images, it is common to see an 8-bit XOR image mask containing only the
pixel values 0 (00000000 binary) and 255 (11111111 binary). When this is XORed pixel-by-pixel
with an original image it reverses the bits of pixels values where the mask is 255, and leaves them
as they are where the mask is zero. The pixels with reversed bits normally 'stand out' against their
original color and so this technique is often used to produce a cursor that is visible against an
arbitrary colored background. The other advantage of using XOR like this is that to undo the
process (for instance when the cursor moves away), it is only necessary to repeat the XOR using
the same mask and all the flipped pixels will become unflipped. Therefore it is not necessary to
explicitly store the original colors of the pixels affected by the mask. Note that the flipped pixels
are not always visible against their unflipped color --- light pixels become dark pixels and dark
pixels become light pixels, but middling gray pixels become middling gray pixels!
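The self-inverting property of such a mask is easy to show in a short sketch (illustrative Python, nested-list images): XORing a second time with the same mask restores the original pixels exactly, so no copy of the covered pixels needs to be kept.

```python
def xor_mask(img, mask):
    """Apply a bit-reversal mask: where the mask is 255 all bits of
    the pixel are flipped, where it is 0 the pixel is unchanged.
    XORing twice with the same mask restores the original image."""
    return [[p ^ m for p, m in zip(rp, rm)] for rp, rm in zip(img, mask)]

img = [[10, 200, 128]]
mask = [[255, 255, 0]]
flipped = xor_mask(img, mask)
print(flipped)                  # [[245, 55, 128]]
print(xor_mask(flipped, mask))  # [[10, 200, 128]] -- original restored
```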
The image

shows a simple graylevel image. Suppose that we wish to overlay this image with
its histogram shown in


so that the two can be compared easily. One way is to use XOR. We first use an image editor to
enlarge the histogram until it is the same size as the first image. The result is shown in

To perform the overlay we simply XOR this image with the first image in bitwise fashion to
produce

Here, the text is quite easy to read, because the original image consists of large and rather light or
rather dark areas. If we proceed in the same way with

we obtain

Note how the writing is dark against light backgrounds and light against dark backgrounds and
hardly visible against gray backgrounds. Compare the result with that described under OR. In fact
XORing is not particularly good for producing easy to read text on gray backgrounds --- we might
do better just to add a constant offset to the image pixels that we wish to highlight (assuming
wraparound under addition overflow) --- but it is often used to quickly produce highlighted pixels
where the background is just black and white or where legibility is not too important.


Exercises

1. XOR

and

Compare the result with the output of XORing their negatives. Do you see the same effect
as for other logical operators?
2. Use the technique discussed above to produce a cursor on

Place the cursor on different location of the image and examine the performance on a
background with high, low, intermediate and mixed pixel values.

Invert/Logical NOT

Common Names: Logical NOT, invert, photographic negative

Brief Description
Logical NOT or invert is an operator which takes a binary or graylevel image as input and produces
its photographic negative, i.e. dark areas in the input image become light and light areas become
dark.

How It Works

To produce the photographic negative of a binary image we can employ the logical NOT operator.
Its truth-table is shown in Figure 1.

Figure 1 Truth-table for logical NOT.

Each pixel in the input image having a logical 1 (often referred to as foreground) has a logical 0
(associated with the background) in the output image, and vice versa. Hence, applying logical NOT
to a binary image changes its polarity.
The logical NOT can also be used for a graylevel image being stored in byte pixel format by
applying it in a bitwise fashion. The resulting value for each pixel is the input value subtracted
from 255:

Q(i,j) = 255 - P(i,j)
Some applications of invert also support integer or float pixel format. In this case, we can't use the
logical NOT operator; instead, the pixel values of the inverted image are simply given by

Q(i,j) = -P(i,j)
If this output image is normalized for an 8-bit display, we again obtain the photographic negative
of the original input image.
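
The byte-format case can be sketched in one line of Python (illustrative only, assuming 8-bit grayscale images as nested lists):

```python
def invert(img, max_val=255):
    """Photographic negative of an 8-bit grayscale image:
    each output pixel is the input pixel subtracted from 255."""
    return [[max_val - p for p in row] for row in img]

print(invert([[0, 100, 255]]))  # [[255, 155, 0]]
```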

Guidelines for Use


When processing a binary image with a logical or morphological operator, its polarity is often
important. Hence, the logical NOT operator is often used to change the polarity of a binary image
as a part of some larger process. For example, if we OR

and


the resulting image,

shows the union of the two backgrounds, because the background is represented with a logical 1. However, if we OR

and

which are the inverted versions of the above image we obtain

Now, the result contains the union of the two circles.


We illustrate another example of the importance of the polarity of a binary image using
the dilation operator. Dilation expands all white areas in a binary image. Hence, if we dilate

the object, being represented with a logical 1, grows and the holes in the object shrink. We obtain


If we dilate

which was obtained by applying logical NOT to the original image, we get

Here, the background is expanded and the object becomes smaller.


Invert can be used for the same purpose on grayscale images, if they are processed with a
morphological or logical operator.
Invert is also used to print the photographic negative of an image or to make the features in an
image appear clearer to a human observer. This can, for example, be useful for medical images,
where the objects often appear in black on a white background. Inverting the image makes the
objects appear in white on a dark background, which is often more suitable for the human eye.
From the original image

of a tissue slice, we obtain the photographic negative


Exercises

1. Apply the erode operator to

and

Which polarity of the image allows you to suppress the circles?


2. Compare the results of ORing

and

and ORing their photographic negatives.


3. Take the photographic negative of

Does it improve the visibility of the features in the image?


Bitshift Operators

Common Names: Bitshifting


Brief Description
The bitshift operator works on images represented in byte or integer pixel format, where each
pixel value is stored as a binary number with a fixed number of bits. Bitshifting shifts the binary
representation of each pixel to the left or to the right by a pre-defined number of positions.
Shifting a binary number by one bit is equivalent to multiplying (when shifting to the left)
or dividing (when shifting to the right) the number by 2.

How It Works
The operation is performed straightforwardly in a single pass. If the binary representation of a
number is shifted in one direction, we obtain an empty position on the opposite side. There are
generally three possibilities of how to fill in this empty position: we can pad the empty bits with
a 0 or a 1 or we can wrap around the bits which are shifted out of the binary representation of the
number on the other side. The last possibility is equivalent to rotating the binary number.
The choice of technique used depends on the implementation of the operator and on the
application. In most cases, bitshifting is used to implement a fast multiplication or division. In
order to obtain the right results for this application, we have to pad the empty bits with a 0. Only
in the case of dividing a negative number by a power of 2, do we need to fill the left bits with a 1,
because a negative number is represented as the two's-complement of the positive
number, i.e. the sign bit is a 1. The result of applying bitshifting in this way is illustrated in the
following formula:

Q(i,j) = P(i,j) * 2^n   (shift left by n bits)
Q(i,j) = P(i,j) / 2^n   (shift right by n bits)
An example is shown in Figure 1.


Figure 1 Examples of using bitshifting for multiplication and division. Note that the bottom
example uses a signed-byte convention where a byte represents a number between -128 and
+127.

If bitshifting is used for multiplication, it might happen that the result exceeds the maximum
possible pixel value. This is the case when a 1 is shifted out of the binary representation of the
pixel value. This information is lost and the effect is that the value is wrapped around from zero.

Guidelines for Use


The main application for the bitshift operator is to divide or multiply an image by a power of 2.
The advantage over the normal pixel division and pixel multiplication operators is that bitshifting
is computationally less expensive.
For example, if we want to add two images we can use bitshifting to make sure that the result will
not exceed the maximum pixel value. We illustrate this example using

and


where the latter is the skeleton gained from the thresholded version of the former. To better
visualize the result of the skeletonization we might want to overlay these two images. However, if
we add them straightforwardly we obtain pixel values greater than the maximum value. First
shifting both images to the right by one bit yields

and

which then can be added without causing any overflow problems. The result can be seen in

Here, we can see that shifting the pixels to the right, like a normal pixel division, decreases the
contrast in the image.
On the other hand, shifting the binary representation of a pixel to the left increases the image
contrast, like the pixel multiplication. For example,

is an image taken under poor lighting conditions. Shifting each pixel in the image to the left by one
bit, which is identical to multiplying it with 2, yields

Although the operator worked well in this example, we have to be aware that the result of the
multiplication might exceed the maximum pixel value. Then, the effect for the pixel value is that it
is wrapped around from 0. For example, if we shift each pixel in the above image by two bits, at
some pixels a 1 is shifted out of the binary representation of the image, resulting in a loss of
information. This can be seen in

In general, we should make sure that the values in the input image are sufficiently small or we
have to be careful when we interpret the resulting image. Alternatively, we can change the pixel
value format prior to applying the bitshift operator, e.g. change from byte format
to integer format.

Although multiplication and division are the main applications for bitshifting it might also be used
for other, often very specialized, purposes. For example, we can store two 4-bit images in a byte
array if we shift one of the two images by 4 bits and mask out the unused bits. Using the logical
OR operator we can combine the two images into one without losing any information. Sometimes
it might also be useful to rotate the binary representation of each pixel, apply some other operator
to the image and finally rotate the pixels back to the initial order.


Exercises
1. Use pixel addition to overlay

and its edge image

Apply the bitshift operator to the original image in order to increase its contrast. Convert
the image into an integer format prior to the shifting to preserve all image information.
Compare the result of the addition with the one you get without the bitshifting.
2. What is the result of dividing -7 (binary: 1001) by 2 using bitshifting? What is the result of
dividing +7 (binary: 0111) by 2?
