DIGITAL IMAGE PROCESSING
(2 marks and 16 marks questions and answers)

SUBJECT NAME : DIGITAL IMAGE PROCESSING (EC5A1)

UNIT I

PART-A

1. Define Image
An image may be defined as a two-dimensional light intensity function f(x, y), where x and y denote spatial coordinates and the amplitude or value of f at any point (x, y) is called the intensity or gray scale or brightness of the image at that point.

2. What is Dynamic Range?
The range of values spanned by the gray scale is called the dynamic range of an image. An image will have high contrast if the dynamic range is high, and a dull, washed-out gray look if the dynamic range is low.

3. Define Brightness
Brightness of an object is the perceived luminance of the surround. Two objects with different surroundings may have identical luminance but different brightness.

5. What do you mean by Gray level?
Gray level refers to a scalar measure of intensity that ranges from black to grays and finally to white.

6. What do you mean by Color model?
A color model is a specification of a 3D coordinate system and a subspace within that system where each color is represented by a single point.

7. List the hardware-oriented color models
1. RGB model
2. CMY model
3. YIQ model
4. HSI model

8. What is Hue and Saturation?
Hue is a color attribute that describes a pure color, whereas saturation gives a measure of the degree to which a pure color is diluted by white light.

9. List the applications of color models
1. RGB model --- used for color monitors and color video cameras
2. CMY model --- used for color printing
3. HSI model --- used for color image processing
4. YIQ model --- used for color picture transmission

10. What is Chromatic Adaptation?
The hue of a perceived color depends on the adaptation of the viewer. For example, the American flag will not immediately appear red, white and blue if the viewer has been subjected to high-intensity red light before viewing the flag. The color of the flag will appear to shift in hue toward cyan, the complement of red.

11. Define Resolution
Resolution is defined as the smallest number of discernible details in an image. Spatial resolution is the smallest discernible detail in an image, and gray-level resolution refers to the smallest discernible change in gray level.

12. What is meant by pixel?
A digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as pixels, image elements, picture elements or pels.

13. Define Digital image
When x, y and the amplitude values of f are all finite, discrete quantities, we call the image a digital image.

14. What are the steps involved in DIP?
1. Image acquisition
2. Preprocessing
3. Segmentation
4. Representation and description
5. Recognition and interpretation

15. What is recognition and interpretation?
Recognition is a process that assigns a label to an object based on the information provided by its descriptors. Interpretation means assigning meaning to a recognized object.

16. Specify the elements of a DIP system
1. Image acquisition
2. Storage
3. Processing
4. Display

17. List the categories of digital storage
1. Short-term storage for use during processing.
2. Online storage for relatively fast recall.
3. Archival storage for infrequent access.

18. What are the types of light receptors?
The two types of light receptors are
• Cones and
• Rods

19. Differentiate photopic and scotopic vision
Photopic vision:
1. The human being can resolve fine details with the cones because each cone is connected to its own nerve end.
2. This is also known as bright-light vision.
Scotopic vision:
1. Several rods are connected to one nerve end, so the rods give the overall picture of the image.
2. This is also known as dim-light vision.

20. How are cones and rods distributed in the retina?
In each eye, cones are in the range 6-7 million and rods are in the range 75-150 million.

21. Define subjective brightness and brightness adaptation
Subjective brightness means intensity as perceived by the human visual system. Brightness adaptation means the human visual system can operate only from the scotopic level to the glare limit; it cannot operate over the whole range simultaneously. It accomplishes this large variation by changes in its overall sensitivity.

22. Define Weber ratio
The ratio of the increment of illumination to the background illumination is called the Weber ratio, i.e. Δi/i.
If the ratio (Δi/i) is small, then a small percentage change in intensity is needed, i.e. good brightness adaptation.
If the ratio (Δi/i) is large, then a large percentage change in intensity is needed, i.e. poor brightness adaptation.

23. What is meant by Mach band effect?
Although the intensity of the stripes is constant, the brightness pattern is perceived as strongly scalloped near the boundaries; these perceived bands are called Mach bands, and the phenomenon is the Mach band effect.

24. What is simultaneous contrast?
A region's perceived brightness does not depend only on its intensity but also on its background. All the center squares have the same intensity; however, they appear to the eye to become darker as the background becomes lighter.

25. What is meant by illumination and reflectance?
Illumination is the amount of source light incident on the scene. It is represented as i(x, y).
Reflectance is the amount of light reflected by the objects in the scene. It is represented as r(x, y).

26. Define sampling and quantization
Sampling means digitizing the coordinate values (x, y).
Quantization means digitizing the amplitude values.

27. Find the number of bits required to store a 256 X 256 image with 32 gray levels
32 gray levels = 2^5, so 5 bits per pixel
256 X 256 X 5 = 327680 bits.
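The storage calculation in Q.27, and the general expression b = M X N X k, can be checked with a minimal Python sketch (the function name is my own, not from the notes; it assumes the number of gray levels is a power of two):

```python
def image_storage_bits(M, N, gray_levels):
    """Bits needed to store an M x N image whose pixels take
    `gray_levels` distinct values: b = M * N * k, with k = log2(levels)."""
    k = gray_levels.bit_length() - 1  # 32 -> 5 bits, 256 -> 8 bits
    return M * N * k

# 256 x 256 image with 32 gray levels -> 5 bits per pixel
print(image_storage_bits(256, 256, 32))  # 327680
```

The same helper reproduces the common textbook figure of about one megabyte (8,388,608 bits) for a 1024 X 1024, 256-level image.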
28. Write the expression to find the number of bits to store a digital image
The number of bits required to store a digital image is
b = M X N X k
When M = N, this equation becomes
b = N^2 k

29. Write short notes on neighbors of a pixel.
The pixel p at coordinates (x, y) has 4 neighbors, i.e. 2 horizontal and 2 vertical neighbors, whose coordinates are given by (x+1, y), (x-1, y), (x, y-1), (x, y+1). These are called direct neighbors and are denoted by N4(p).
The four diagonal neighbors of p have coordinates (x+1, y+1), (x+1, y-1), (x-1, y-1), (x-1, y+1). They are denoted by ND(p).
The eight neighbors of p, denoted by N8(p), are the combination of the 4 direct neighbors and the 4 diagonal neighbors.

30. Explain the types of connectivity.
1. 4-connectivity
2. 8-connectivity
3. m-connectivity (mixed connectivity)

31. What is meant by path?
A path from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates.

32. Give the formula for calculating D4 and D8 distance.
D4 distance (city-block distance) is defined by
D4(p, q) = |x-s| + |y-t|
D8 distance (chessboard distance) is defined by
D8(p, q) = max(|x-s|, |y-t|).

33. What is geometric transformation?
Transformation is used to alter the coordinate description of an image.
The basic geometric transformations are
1. Image translation
2. Scaling
3. Image rotation

34. What is image translation and scaling?
Image translation means repositioning the image from one coordinate location to another along a straight-line path.
Scaling is used to alter the size of the object or image, i.e. the coordinate system is scaled by a factor.

35. Define the term Luminance
Luminance, measured in lumens (lm), gives a measure of the amount of energy an observer perceives from a light source.

PART-B

1. Explain Brightness adaptation and Discrimination.
Digital images are displayed as a discrete set of intensities, so the eye's ability to discriminate between different intensity levels is important. Subjective brightness is a logarithmic function of the light intensity incident on the eye. The long solid curve represents the range of intensities to which the visual system can adapt. In photopic vision alone the range is about 10^6. The visual system cannot operate over such a range simultaneously; it accomplishes this large variation by changes in its overall sensitivity, a phenomenon known as brightness adaptation.
The eye's ability to discriminate between different intensity levels at any specific adaptation level is measured by the ratio ΔI/I. The eye is capable of detecting contouring effects in monochrome images whose overall intensity is represented by fewer than approximately two dozen levels.

Write about Mach band effect.
Two phenomena demonstrate that perceived brightness is not only a function of intensity: the Mach band pattern and simultaneous contrast.
Mach band pattern: the visual system tends to undershoot or overshoot around the boundary of regions of different intensities. Although the width of each stripe is constant, it is perceived as if the brightness pattern is strongly scalloped near the boundaries.
The second phenomenon, simultaneous contrast, is related to the fact that a region's perceived brightness does not depend only on its intensity. In the figure all the center squares have the same intensity; however, they appear to the eye to become darker as the background gets lighter.
Example: a piece of paper seems white when lying on the desk, but can appear black when used to shield the eyes while looking at a brighter sky.

Explain sampling and quantization.
For computer processing, the image function f(x, y) must be digitized both spatially and in amplitude. Digitization of the spatial coordinates is called image sampling, and amplitude digitization is called gray-level quantization.
Sampling:
Consider a digital image of size 1024 X 1024 with 256 gray levels. With the display area used for the image kept the same, the pixels in the lower-resolution images are duplicated in order to fill the entire display. The pixel replication produces a checkerboard effect, which is visible in the images of lower resolution. It is not possible to differentiate a 512 X 512 image from a 1024 X 1024 image under this effect, but a slight increase in graininess and a small decrease in sharpness are noted. A 256 X 256 image shows a fine checkerboard pattern at the edges and more pronounced graininess throughout the image. These effects are much more visible in 128 X 128 images and become quite pronounced in 64 X 64 and 32 X 32 images.
Quantization:
This discusses the effects produced when the number of bits used to represent the gray levels in an image is decreased, illustrated by reducing the gray levels required to represent a 1024 X 1024, 512-level image. The 256-, 128- and 64-level images are visually identical for all practical purposes. The 32-level image develops a set of ridge-like structures in areas of smooth gray levels. This effect, caused by the use of an insufficient number of gray levels in smooth areas of a digital image, is called false contouring. It is visible in images displayed using 16 or fewer gray-level values.

4. Explain color image fundamentals.
Although the process followed by the human brain in perceiving and interpreting color is a physiopsychological phenomenon that is not yet fully understood, the physical nature of color can be expressed on a formal basis supported by experimental and theoretical results.
Basically, the colors that humans and some other animals perceive in an object are determined by the nature of the light reflected from the object. Visible light is composed of a relatively narrow band of frequencies in the electromagnetic spectrum. A body that reflects light that is balanced in all visible wavelengths appears white to the observer. For example, green objects reflect light with wavelengths primarily in the 500 to 570 nm range while absorbing most of the energy at other wavelengths.
Three basic quantities are used to describe the quality of a chromatic light source: radiance, luminance and brightness. Radiance is the total amount of energy that flows from the light source, and is usually measured in watts (W). Luminance, measured in lumens (lm), gives a measure of the amount of energy an observer perceives from a light source. Finally, brightness is a subjective descriptor that is practically impossible to measure.

5. Explain the CMY model.
This model deals with cyan, magenta and yellow, the secondary colors of light. When a surface coated with cyan pigment is illuminated with white light, no red light is reflected from the surface: cyan subtracts red light from reflected white light, which itself is composed of equal amounts of red, green and blue light. Devices using this model either require cyan, magenta and yellow data input or perform an RGB to CMY conversion internally:
C = 1 - R
M = 1 - G
Y = 1 - B
All color values are assumed normalized to the range [0, 1]. The light reflected from a surface coated with pure cyan does not contain red. RGB values can be obtained easily from a set of CMY values by subtracting the individual CMY values from 1. Equal amounts of these three pigments produce black. When black is added as a fourth color, this gives rise to the CMYK color model used in four-color printing.

PART-C

1. Describe the fundamental steps in image processing.
Digital image processing encompasses a broad range of hardware, software and theoretical underpinnings.
The problem domain in this example consists of pieces of mail, and the objective is to read the address on each piece. Thus the desired output in this case is a stream of alphanumeric characters.
The first step in the process is image acquisition, that is, acquiring a digital image. To do so requires an imaging sensor and the capability to digitize the signal produced by the sensor.
After the digital image has been obtained, the next step deals with preprocessing that image. The key function of preprocessing is to improve the image in ways that increase the chances for success of the other processes.
The next stage deals with segmentation. Broadly defined, segmentation partitions an input image into its constituent parts or objects. Its key role here is to extract individual characters and words from the background. The output of the segmentation stage usually is raw pixel data, constituting either the boundary of a region or all the points in the region itself.
Choosing a representation is only part of the solution for transforming raw data into a form suitable for subsequent computer processing. Description, also called feature selection, deals with extracting features that result in some quantitative information of interest, or features that are basic for differentiating one class of objects from another.
The last stage involves recognition and interpretation. Recognition is the process that assigns a label to an object based on the information provided by its descriptors. Interpretation involves assigning meaning to an ensemble of recognized objects.
Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. This knowledge may be as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information.
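The neighborhood and distance definitions of Q.29-32 translate directly into code. A minimal sketch, with illustrative function names of my own:

```python
def n4(x, y):
    """4-neighbours (direct neighbours) of pixel p = (x, y)."""
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(x, y):
    """Four diagonal neighbours of p."""
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y - 1), (x - 1, y + 1)}

def n8(x, y):
    """8-neighbours: union of the direct and diagonal neighbours."""
    return n4(x, y) | nd(x, y)

def d4(p, q):
    """City-block distance |x-s| + |y-t|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance max(|x-s|, |y-t|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

print(len(n8(5, 5)))                           # 8
print(d4((0, 0), (3, 4)), d8((0, 0), (3, 4)))  # 7 4
```

Note that D4 and D8 of the same pixel pair generally differ; both satisfy the metric axioms listed later under distance measures.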

The knowledge base can also be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem, or an image database containing high-resolution satellite images of a region in connection with change-detection applications.
Although we do not discuss image display explicitly at this point, it is important to keep in mind that viewing the results of image processing can take place at the output of any step.

2. Explain the basic elements of digital image processing.
The five elements of digital image processing are
• image acquisition
• storage
• processing
• communication
• display

1) Image acquisition:
Two devices are required to acquire a digital image:
1) A physical device, which produces an electric signal proportional to the amount of light energy sensed.
2) A digitizer, a device for converting the electric output into digital form.

2) Storage:
An 8-bit image of size 1024 X 1024 requires one million bytes of storage. Three types of storage are used:
a) Short-term storage: used during processing; it is provided by computer memory, and consists of frame buffers which can store one or more images and can be accessed quickly at video rates.
b) Online storage: used for fast recall; it normally uses magnetic disks. Winchester disks with hundreds of megabytes are commonly used.
c) Archival storage: passive storage devices used for infrequent access; magnetic tapes and optical discs are the media. High-density magnetic tapes can store 1 megabit in about 13 feet of tape.

3) Processing:
Processing of a digital image involves procedures that are expressed in terms of algorithms. With the exception of image acquisition and display, most image processing functions can be implemented in software; specialized hardware is needed when increased speed in an application is required. Large-scale image processing systems are still being used for massive imaging applications, but the trend is toward general-purpose small computers equipped with image processing hardware.

4) Communication:
Communication in image processing involves local communication between image processing systems and remote communication from one point to another in connection with the transmission of image data. Hardware and software are available for most computers. A telephone line can transmit at a maximum rate of 9600 bits per second, so transmitting a 512 X 512, 8-bit image at this rate requires at least 5 min. Communication links using intermediate stations such as satellites are much faster, but they are costly.

5) Display:
Monochrome and color TV monitors are the principal display devices used in modern image processing systems. The monitors are driven by the outputs of the hardware in the display module of the computer.

3. Explain the structure of the human eye.
The eye is nearly a sphere, with an average diameter of approximately 20 mm. Three membranes enclose the eye:
1. The cornea and sclera
2. The choroid
3. The retina
Cornea: The cornea is a tough, transparent tissue that covers the anterior surface of the eye.
Sclera: The sclera is an opaque membrane that encloses the remainder of the optical globe.
Choroid: The choroid lies directly below the sclera. This membrane contains a network of blood vessels that serve as the major source of nutrition to the eye. The choroid coat is heavily pigmented and helps to reduce the amount of extraneous light entering the eye. The choroid is divided into the ciliary body and the iris diaphragm.
Lens: The lens is made up of concentric layers of fibrous cells and is suspended by fibers that attach to the ciliary body. It contains 60 to 70% water, about 6% fat, and more protein than any other tissue in the eye.
Retina: The innermost membrane of the eye is the retina, which lines the inside of the wall's entire posterior portion. There are two classes of receptors:
1. Cones
2. Rods
Cones: The cones in each eye number between 6 and 7 million. They are located primarily in the central portion of the retina, called the fovea, and are highly sensitive to color.
Rods: The number of rods is much larger; some 75 to 150 million are distributed over the retinal surface.
The fovea can be taken as a square sensor array of size 1.5 mm X 1.5 mm.

4. Explain the RGB model.
In the RGB model, each color appears in its primary spectral components of red, green and blue. This model is based on a Cartesian coordinate system. The color subspace of interest is the cube in which the R, G and B primary values are at three corners; cyan, magenta and yellow are at three other corners; black is at the origin; and white is at the corner farthest from the origin. In this model the gray scale extends from black to white along the line joining these two points. The different colors in this model are points on or inside the cube and are defined by vectors extending from the origin.
Images represented in the RGB color model consist of three component images, one for each primary color. When each red, green and blue component image is an 8-bit image, each RGB color pixel is said to have a depth of 24 bits. The total number of colors in a 24-bit RGB image is (2^8)^3 = 16,777,216.
The process of acquiring a color image is shown in the figure. A color image can be acquired by using three filters, sensitive to red, green and blue respectively. When we view a color scene with a monochrome camera equipped with one of these filters, the result is a monochrome image whose intensity is proportional to the response of that filter. Repeating this process with each filter produces three monochrome images that are the RGB component images of the color scene.
A subset of colors is called the set of safe RGB colors, or the set of all-systems-safe colors. In Internet applications they are called safe Web colors or safe browser colors. There are 256 colors obtainable from different combinations, but only 216 of them are used as safe colors.

Explain the HSI color image model.
The RGB, CMY and other color models are not well suited for describing colors in terms that are practical for human interpretation. For example, one does not refer to the color of an automobile by giving the percentage of each of the primaries composing its color. When humans view a color object, we describe it by its hue, saturation and brightness.
• Hue is a color attribute that describes a pure color.
• Saturation gives a measure of the degree to which a pure color is diluted by white light.
• Brightness is a subjective descriptor that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors in describing color sensation.
• Intensity is a most useful descriptor of monochromatic images.

Converting colors from RGB to HSI
Given an image in RGB color format,
• the H component of each RGB pixel is obtained using the equation
H = { theta          if B <= G
    { 360 - theta    if B > G
with theta = cos^-1 { (1/2)[(R-G) + (R-B)] / [(R-G)^2 + (R-B)(G-B)]^(1/2) }
• The saturation component is given by
S = 1 - [3/(R+G+B)] min(R, G, B)
• The intensity component is given by
I = (1/3)(R+G+B)

Converting colors from HSI to RGB
Given values of HSI in the interval [0,1], we now want to find the corresponding RGB values in the same range. We begin by multiplying H by 360°, which returns the hue to its original range of [0°, 360°].
RG sector (0° <= H < 120°). When H is in this sector, the RGB components are given by
B = I (1 - S)
R = I [1 + S cos H / cos(60° - H)]
G = 3I - (R + B)
GB sector (120° <= H < 240°). If the given value of H is in this sector, we first subtract 120° from it:
H = H - 120°
Then the RGB components are
R = I (1 - S)
G = I [1 + S cos H / cos(60° - H)]
B = 3I - (R + G)
BR sector (240° <= H <= 360°). Finally, if H is in this range, we subtract 240° from it:
H = H - 240°
Then the RGB components are
G = I (1 - S)
B = I [1 + S cos H / cos(60° - H)]
R = 3I - (G + B)
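The RGB-to-HSI equations above can be sketched numerically. This is a hedged illustration: the function name, the clamping of the cosine argument and the epsilon guard against a zero denominator (when R = G = B, hue is undefined) are my own additions.

```python
import math

def rgb_to_hsi(r, g, b, eps=1e-12):
    """RGB components in [0,1] -> (H in degrees, S, I) using the
    geometric formulas quoted in the HSI section."""
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    h = theta if b <= g else 360.0 - theta
    s = 0.0 if r == g == b else 1.0 - 3.0 * min(r, g, b) / (r + g + b)
    i = (r + g + b) / 3.0
    return h, s, i

# Pure primaries land 120 degrees apart, fully saturated:
print(rgb_to_hsi(0.0, 1.0, 0.0))  # H ~ 120, S = 1.0, I ~ 0.333
```

Pure red maps to H ~ 0°, green to ~120° and blue to ~240°, which is a quick sanity check on a hand-coded conversion.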
6. Describe the basic relationships between pixels.
2-D mathematical preliminaries:
• Neighbours of a pixel
• Adjacency, connectivity, regions and boundaries
• Distance measures

Neighbours of a pixel
• A pixel p at coordinates (x, y) has four horizontal and vertical neighbours whose coordinates are given by
(x+1, y), (x-1, y), (x, y+1), (x, y-1).

• This set of pixels, called the 4-neighbours of p, is denoted by N4(p). Each pixel is a unit distance from (x, y), and some of the neighbours of p lie outside the digital image if (x, y) is on the border of the image.
• The four diagonal neighbours of p have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
• and are denoted by ND(p). These points, together with the 4-neighbours, are called the 8-neighbours of p, denoted by N8(p).

Adjacency, Connectivity, Regions and Boundaries
Three types of adjacency:
• 4-adjacency. Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
• 8-adjacency. Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
• m-adjacency. Two pixels p and q with values from V are m-adjacent if q is in N4(p), or q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
• A (digital) path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates
(x0, y0), (x1, y1), ................, (xn, yn)
where (x0, y0) = (x, y), (xn, yn) = (s, t), and pixels (xi, yi) and (xi-1, yi-1) are adjacent for 1 <= i <= n. Here n is the length of the path.

Distance measures
• For pixels p, q and z with coordinates (x, y), (s, t) and (v, w) respectively, D is a distance function or metric if
 D(p,q) >= 0 (D(p,q) = 0 iff p = q),
 D(p,q) = D(q,p), and
 D(p,z) <= D(p,q) + D(q,z)
• The Euclidean distance between p and q is defined as
 De(p,q) = [(x-s)^2 + (y-t)^2]^(1/2)
• The D4 distance (also called city-block distance) between p and q is defined as
 D4(p,q) = |x-s| + |y-t|
• The D8 distance (also called chessboard distance) between p and q is defined as
 D8(p,q) = max(|x-s|, |y-t|)

UNIT II

PART-A

1. What is the need for transform?
Most signals or images are time-domain signals, i.e. signals that can be measured as a function of time. This representation is not always best. For most image processing applications, one of the mathematical transformations is applied to the signal or image to obtain further information from it.

2. What is Image Transform?
An image can be expanded in terms of a discrete set of basis arrays called basis images. These basis images can be generated by unitary matrices. Alternatively, a given N X N image can be viewed as an N^2 X 1 vector. An image transform provides a set of coordinates or basis vectors for the vector space.

3. What are the applications of transform?
1) To reduce bandwidth
2) To reduce redundancy
3) To extract features.

4. Give the conditions for a perfect transform.
Transpose of the matrix = inverse of the matrix.
Orthogonality.

5. What are the properties of unitary transform?
1) The determinant and the eigenvalues of a unitary matrix have unity magnitude.
2) The entropy of a random vector is preserved under a unitary transformation.
3) Since entropy is a measure of average information, this means information is preserved under a unitary transformation.

6. Define Fourier transform pair.
The Fourier transform of f(x), denoted by F(u), is defined by
F(u) = ∫ from -∞ to ∞ of f(x) e^(-j2πux) dx ----------------(1)
The inverse Fourier transform is defined by
f(x) = ∫ from -∞ to ∞ of F(u) e^(j2πux) du --------------------(2)
Equations (1) and (2) are known as the Fourier transform pair.

7. Define Fourier spectrum and spectral density.
The Fourier spectrum is defined as
F(u) = |F(u)| e^(jφ(u))
where
|F(u)| = [R^2(u) + I^2(u)]^(1/2)
φ(u) = tan^-1 (I(u)/R(u))
The spectral density is defined by
p(u) = |F(u)|^2 = R^2(u) + I^2(u)

8. Give the relation for the 1-D discrete Fourier transform pair.
The discrete Fourier transform is defined by
F(u) = (1/N) Σ(x=0 to N-1) f(x) e^(-j2πux/N)
The inverse discrete Fourier transform is given by
f(x) = Σ(u=0 to N-1) F(u) e^(j2πux/N)
These equations are known as the discrete Fourier transform pair.

9. Specify the properties of the 2D Fourier transform.
The properties are
• Separability
• Translation
• Periodicity and conjugate symmetry
• Rotation
• Distributivity and scaling
• Average value
• Laplacian
• Convolution and correlation
• Sampling

10. Mention the separability property of the 2D Fourier transform.
The advantage of the separability property is that F(u, v) and f(x, y) can be obtained by successive application of the 1D Fourier transform or its inverse:
F(u, v) = (1/N) Σ(x=0 to N-1) F(x, v) e^(-j2πux/N)
where
F(x, v) = N [(1/N) Σ(y=0 to N-1) f(x, y) e^(-j2πvy/N)]

11. List the properties of the twiddle factor.
1. Periodicity
WN^(K+N) = WN^K
2. Symmetry
WN^(K+N/2) = -WN^K

12. Give the properties of the one-dimensional DFT.
1. The DFT and unitary DFT matrices are symmetric.
2. The extensions of the DFT and unitary DFT of a sequence and their inverse transforms are periodic with period N.
3. The DFT or unitary DFT of a real sequence is conjugate symmetric about N/2.

13. Give the properties of the two-dimensional DFT.
1. Symmetric
2. Periodic extensions
3. Sampled Fourier transform
4. Conjugate symmetry.

14. What is meant by convolution?
The convolution of two functions is defined by
f(x)*g(x) = ∫ f(α) g(x-α) dα
where α is the dummy variable.

15. State the convolution theorem for 1D.
• If f(x) has a Fourier transform F(u) and g(x) has a Fourier transform G(u), then f(x)*g(x) has the Fourier transform F(u).G(u).
• Convolution in the x domain can be obtained by taking the inverse Fourier transform of the product F(u).G(u).
• Convolution in the frequency domain corresponds to multiplication in the x domain:
f(x).g(x) <-> F(u)*G(u)
• These two results are referred to as the convolution theorem.

16. What is wrap-around error?
The individual periods of the convolution will overlap; this overlap is referred to as wrap-around error.

17. Give the formula for the correlation of 1D continuous functions.
The correlation of two continuous functions f(x) and g(x) is defined by
f(x) o g(x) = ∫ f*(α) g(x+α) dα

18. What are the properties of the Haar transform?
1. The Haar transform is real and orthogonal.
2. The Haar transform is a very fast transform.
3. The Haar transform has very poor energy compaction for images.
4. The basis vectors of the Haar matrix are sequency ordered.

19. What are the properties of the Slant transform?
1. The Slant transform is real and orthogonal.
2. The Slant transform is a fast transform.
3. The Slant transform has very good energy compaction for images.
4. The basis vectors of the Slant matrix are not sequency ordered.

20. Specify the properties of the forward transformation kernel.
The forward transformation kernel is said to be separable if
g(x, y, u, v) = g1(x, u).g2(y, v)
The forward transformation kernel is symmetric if g1 is functionally equal to g2:
g(x, y, u, v) = g1(x, u).g1(y, v)

21. Define fast Walsh transform.
The Walsh transform is defined by
w(u) = (1/N) Σ(x=0 to N-1) f(x) Π(i=0 to n-1) (-1)^(bi(x).b(n-1-i)(u))
where bk(z) is the kth bit in the binary representation of z and N = 2^n.

22. Give the relation for the 1-D DCT.
• The 1-D DCT is
C(u) = α(u) Σ(x=0 to N-1) f(x) cos[(2x+1)uπ/2N]  where u = 0, 1, 2, ..., N-1
• The inverse is
f(x) = Σ(u=0 to N-1) α(u) C(u) cos[(2x+1)uπ/2N]  where x = 0, 1, 2, ..., N-1
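The 1-D DCT pair of Q.22 can be verified numerically. A minimal sketch using the α(u) normalisation stated in that answer (α(0) = 1/√N, α(u) = √(2/N) otherwise; function names are my own):

```python
import math

def _alpha(u, N):
    """Normalisation factor from the notes: 1/sqrt(N) for u=0, else sqrt(2/N)."""
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

def dct1d(f):
    """C(u) = alpha(u) * sum_x f(x) cos((2x+1)u*pi / 2N)."""
    N = len(f)
    return [_alpha(u, N) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                               for x in range(N))
            for u in range(N)]

def idct1d(C):
    """f(x) = sum_u alpha(u) C(u) cos((2x+1)u*pi / 2N)."""
    N = len(C)
    return [sum(_alpha(u, N) * C[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                for u in range(N))
            for x in range(N)]

# A constant signal compacts all its energy into the u=0 coefficient:
print(dct1d([1.0, 1.0, 1.0, 1.0]))  # ~[2, 0, 0, 0]
```

The round trip idct1d(dct1d(f)) recovers f to floating-point precision, consistent with the transform being orthogonal.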
N-1 K=1 F(u)=1/2{Feven(u)+Fodd(u).W2Mu} PART-C
• Inverse f(x)= ∑ α(u) c(u) cos[((2x+1) uп)/2N] where x=0,1,2,…N-1 F(u+M)=1/2{1/M x=0∑ M-1f(2x)WM(u+M)x + 1/M x=0∑ M-1f(2x+1)WM(u+M)x .W2Mu+M }
V=0 K-L Transform Y= A(X- MX) F(u+M)=1/2{1/M x=0∑ M-1f(2x)WMux - 1/M x=0∑ M-1f(2x+1)WMux .W2Mu } 1Explain Discrete Fourier Transform in detail.
F(u+M)=1/2[Feven(u)+Fodd(u) .W2Mu] • ID Case
23. Write slant transform matrix SN. 26. Give the equation for singular value decomposition of an image? F(u)=1/N x=0∑N-1f(x)exp[-j2πux/N] for u=0,1,2,…….N-1------------------(1)
[Link] short notes on Hotelling transform
f(x)= u=0∑N-1F(u)[j2πux/N], for x=0,1,2,…….N-1--------------------------(2)
1 0 Consider a set of n or multi-dimensional discrete signal represented as column vector Equations (1) and (2) called Discrete Fourier transform pair
1 0 0 0 U= m=1∑rψ√λm φmT
-aN bN x1,x2,…xn each having M elements, The values u=0,1,2,………N-1 in the discrete Fourier transform corresponds to the
aN bN This equation is called as singular value decomposition of an image.
S samples of the continuous transform at values 0, ∆u, 2∆u….(N-1)∆u. In other words F(u)
0 X1 corresponds F(u∆u). The terms ∆u and ∆x related by the expression ∆u=1/N∆x
0 I(N/2-1) 0 I(N/2-1) N/2 27. Write the properties of Singular value Decomposition(SVD)?
• The SVD transform varies drastically from image to image. X2 • 2D Case
• The SVD transform gives best energy packing efficiency for any given image. X= . F(u,v)=1/MN x=0∑M-1y=0∑N-1f(x,y)exp[-j2πux/M+vy/N]
SN = 1/√2 1 0 0 -1 . for u=0,1,2,…….M-1, v=0,1,2,……..N-1
0 0 • The SVD transform is useful in the design of filters finding least square,minimum
-bN aN bN aN Xn f(x,y)=x=0∑M-1y=0∑N-1F(u,v)exp[j2πux/M+vy/N]
solution of linear equation and finding rank of large matrices.
S for x=0,1,2,…….M-1, y=0,1,2,……..N-1
0
N/2 • The mean vector is defined as Mx=E{x},Where E{x} is the expected value of x. For a square image M=N, FT pair will be
0 I(N/2-1) 0 -I(N/2-1)
PART-B M F(u, v)=1/N x=0∑N-1y=0∑N-1f(x,y)exp[-j2π(ux +vy)/N]
For M vector samples mean vector is Mx=1/M ∑ Xk for u, v=0,1,2,…….N-1
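The 1-D DFT pair in equations (1) and (2) can be checked numerically. The following sketch (not part of the original notes; it assumes Python with NumPy and an arbitrary sample signal) builds the forward transform with the 1/N factor used here and verifies that the inverse recovers the signal:

```python
import numpy as np

N = 8
f = np.array([1.0, 3.0, 2.0, 5.0, 0.0, 1.0, 4.0, 2.0])  # sample signal f(x)
x = np.arange(N)

# Forward DFT with the 1/N factor used in these notes:
# F(u) = (1/N) * sum_x f(x) * exp(-j 2 pi u x / N)
E = np.exp(-2j * np.pi * np.outer(x, x) / N)
F = (E @ f) / N

# Inverse DFT (no 1/N factor in this convention):
# f(x) = sum_u F(u) * exp(+j 2 pi u x / N)
f_rec = np.conj(E) @ F

# The pair reconstructs the signal, and F agrees with NumPy's FFT scaled by 1/N
assert np.allclose(f_rec.real, f)
assert np.allclose(F, np.fft.fft(f) / N)
```

Note that placing the 1/N factor on the forward or on the inverse transform is purely a convention; NumPy's `fft` puts it on the inverse, hence the division by N in the comparison.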
PART-B

1. Write short notes on Discrete Cosine Transform (DCT)
• 1-D DCT is defined as
C(u) = α(u) Σ_{x=0}^{N−1} f(x) cos[(2x+1)uπ/2N], for u = 0,1,2,…,N−1
Inverse DCT is defined as
f(x) = Σ_{u=0}^{N−1} α(u) C(u) cos[(2x+1)uπ/2N], for x = 0,1,2,…,N−1
In both cases α(u) = 1/√N for u = 0 and √2/√N for u = 1,2,…,N−1.
• 2-D DCT is defined as
C(u,v) = α(u) α(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) cos[(2x+1)uπ/2N] cos[(2y+1)vπ/2N], for u,v = 0,1,2,…,N−1
Inverse DCT is defined as
f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} α(u) α(v) C(u,v) cos[(2x+1)uπ/2N] cos[(2y+1)vπ/2N], for x,y = 0,1,2,…,N−1

24. Define Haar transform.
The Haar transform can be expressed in matrix form as
T = HFH
where F = N×N image matrix, H = N×N transformation matrix, and T = resulting N×N transform.

25. Define K-L transform.
Consider a set of n multi-dimensional discrete signals represented as column vectors x1, x2, …, xn, each having M elements:
X = [X1, X2, …, Xn]^T
• The mean vector is defined as Mx = E{x}, where E{x} is the expected value of x. For M vector samples, the mean vector is Mx = (1/M) Σ_{K=1}^{M} XK.
• The covariance matrix is Cx = E{(X − Mx)(X − Mx)^T}, where T denotes vector transposition; X is an n-dimensional vector and Cx is an n×n matrix. For M samples, Cx = (1/M) Σ_{k=1}^{M} (xk − Mx)(xk − Mx)^T.
• The K-L transform is Y = A(X − Mx).
Features:
o It is based on the statistical properties of the vector representation.
o It has several useful properties that make it an important tool for image processing.
o It converts discrete signals into a sequence of uncorrelated coefficients.

2. Describe Fast Fourier Transform
• The Fourier transform of f(x), denoted by F(u), is defined by
F(u) = ∫_{−∞}^{∞} f(x) e^{−j2πux} dx ----------------(1)
• The inverse Fourier transform of f(x) is defined by
f(x) = ∫_{−∞}^{∞} F(u) e^{j2πux} du --------------------(2)
Equations (1) and (2) are known as the Fourier transform pair.
• In discrete form, with 2M samples,
F(u) = (1/2M) Σ_{x=0}^{2M−1} f(x) W_{2M}^{ux}
Separating the even- and odd-numbered samples,
F(u) = 1/2 { (1/M) Σ_{x=0}^{M−1} f(2x) W_{2M}^{2ux} + (1/M) Σ_{x=0}^{M−1} f(2x+1) W_{2M}^{u(2x+1)} }

2. Explain the Properties of 2D discrete Fourier Transform
For a square image (M = N):
F(u,v) = (1/N) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) exp[−j2π(ux + vy)/N], for u, v = 0,1,2,…,N−1
f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} F(u,v) exp[j2π(ux + vy)/N], for x, y = 0,1,2,…,N−1
1. Separability
F(u,v) = (1/N) Σ_{x=0}^{N−1} F(x,v) exp[−j2πux/N]
where F(x,v) = N [(1/N) Σ_{y=0}^{N−1} f(x,y) exp[−j2πvy/N]]
2. Translation
The translation properties of the Fourier transform pair are
f(x,y) exp[j2π(u0x + v0y)/N] ⇔ F(u − u0, v − v0)
and f(x − x0, y − y0) ⇔ F(u,v) exp[−j2π(ux0 + vy0)/N]
where the double arrow indicates the correspondence between a function and its Fourier transform.
3. Periodicity and Conjugate Symmetry
• Periodicity: the discrete Fourier transform and its inverse are periodic with period N; that is,
F(u,v) = F(u+N, v) = F(u, v+N) = F(u+N, v+N)
• Conjugate symmetry: if f(x,y) is real, the Fourier transform also exhibits conjugate symmetry,
F(u,v) = F*(−u,−v) or |F(u,v)| = |F(−u,−v)|, where F*(u,v) is the complex conjugate of F(u,v).
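The mean vector, covariance matrix and transform Y = A(X − Mx) above can be illustrated numerically. This is an illustrative sketch (Python/NumPy assumed; A is taken as the matrix whose rows are the eigenvectors of Cx ordered by decreasing eigenvalue, which is the usual choice for the Hotelling transform, and the mixing matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 3, 500                             # n-dimensional vectors, M samples
mix = np.array([[2.0, 0.0, 0.0],
                [1.0, 1.0, 0.0],
                [0.5, 0.2, 0.3]])
X = rng.normal(size=(M, n)) @ mix.T       # correlated sample population

Mx = X.mean(axis=0)                       # mean vector Mx = (1/M) sum_k x_k
Xc = X - Mx
Cx = (Xc.T @ Xc) / M                      # Cx = (1/M) sum_k (x_k-Mx)(x_k-Mx)^T

# Rows of A are the eigenvectors of Cx, ordered by decreasing eigenvalue
eigvals, eigvecs = np.linalg.eigh(Cx)
A = eigvecs[:, np.argsort(eigvals)[::-1]].T

Y = (A @ Xc.T).T                          # Hotelling transform Y = A(X - Mx)
Cy = (Y.T @ Y) / M                        # covariance of the transformed vectors

# The transform decorrelates the coefficients: Cy is (numerically) diagonal
assert np.allclose(Cy - np.diag(np.diag(Cy)), 0, atol=1e-8)
```

The diagonal of Cy holds the eigenvalues of Cx, which is why the K-L transform "converts discrete signals into a sequence of uncorrelated coefficients".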
F(u) = 1/2 { (1/M) Σ_{x=0}^{M−1} f(2x) W_M^{ux} + (1/M) Σ_{x=0}^{M−1} f(2x+1) W_M^{ux}·W_{2M}^{u} } = 1/2 [Feven(u) + Fodd(u)·W_{2M}^u]

4. Rotation
In polar coordinates x = r cosθ, y = r sinθ, u = ω cosφ, v = ω sinφ, f(x,y) and F(u,v) become f(r,θ) and F(ω,φ) respectively. Rotating f(x,y) by an angle θ0 rotates F(u,v) by the same angle; similarly, rotating F(u,v) rotates f(x,y) by the same angle. That is,
f(r, θ + θ0) ⇔ F(ω, φ + θ0)
5. Distributivity and Scaling
• Distributivity: the discrete Fourier transform and its inverse are distributive over addition but not over multiplication.
F[f1(x,y) + f2(x,y)] = F[f1(x,y)] + F[f2(x,y)]
F[f1(x,y)·f2(x,y)] ≠ F[f1(x,y)]·F[f2(x,y)]
• Scaling: for two scalars a and b,
a f(x,y) ⇔ a F(u,v) and f(ax, by) ⇔ (1/|ab|) F(u/a, v/b)
6. Laplacian
The Laplacian of a two-variable function f(x,y) is defined as ∇²f(x,y) = ∂²f/∂x² + ∂²f/∂y².
7. Convolution and Correlation
• Convolution
The convolution of two functions f(x) and g(x), denoted by f(x)*g(x), is defined by the integral
f(x)*g(x) = ∫_{−∞}^{∞} f(α) g(x−α) dα, where α is a dummy variable.
Convolution in the spatial domain corresponds to multiplication of the transforms in the frequency domain:
f(x)*g(x) ⇔ F(u)G(u)
• Correlation
The correlation of two functions f(x) and g(x), denoted by f(x)∘g(x), is defined by the integral
f(x)∘g(x) = ∫_{−∞}^{∞} f*(α) g(x+α) dα, where α is a dummy variable.
For the discrete case, fe(x)∘ge(x) = (1/M) Σ_{m=0}^{M−1} f*(m) g(x+m), with the extended functions
fe(x) = { f(x), 0 ≤ x ≤ A−1;  0, A ≤ x ≤ M−1 }
ge(x) = { g(x), 0 ≤ x ≤ B−1;  0, B ≤ x ≤ N−1 }

3. Explain Hadamard transform in detail
• 1D Hadamard transform
H(u) = (1/N) Σ_{x=0}^{N−1} f(x) (−1)^{Σ_{i=0}^{n−1} bi(x) bi(u)} = Σ_{x=0}^{N−1} f(x) g(x,u)
where g(x,u) = (1/N) (−1)^{Σ_{i=0}^{n−1} bi(x) bi(u)} is known as the 1D forward Hadamard kernel, N = 2^n, and bk(z) is the kth bit in the binary representation of z.
• Inverse 1D Hadamard transform
f(x) = Σ_{u=0}^{N−1} H(u) (−1)^{Σ_{i=0}^{n−1} bi(x) bi(u)} = Σ_{u=0}^{N−1} H(u) h(x,u)
where h(x,u) = (−1)^{Σ_{i=0}^{n−1} bi(x) bi(u)}.
• 2D Hadamard transform
H(u,v) = (1/N) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) (−1)^{Σ_{i=0}^{n−1} [bi(x) bi(u) + bi(y) bi(v)]} = Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) g(x,y,u,v)
where g(x,y,u,v) = (1/N) (−1)^{Σ_{i=0}^{n−1} [bi(x) bi(u) + bi(y) bi(v)]} is the forward transformation kernel. Similarly,
f(x,y) = (1/N) Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} H(u,v) (−1)^{Σ_{i=0}^{n−1} [bi(x) bi(u) + bi(y) bi(v)]} = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} H(u,v) h(x,y,u,v)
where h(x,y,u,v) = (1/N) (−1)^{Σ_{i=0}^{n−1} [bi(x) bi(u) + bi(y) bi(v)]} is the inverse kernel.
• Ordered Hadamard Transform
H(u) = (1/N) Σ_{x=0}^{N−1} f(x) (−1)^{Σ_{i=0}^{n−1} bi(x) pi(u)} = Σ_{x=0}^{N−1} f(x) g(x,u)
where g(x,u) = (1/N) (−1)^{Σ_{i=0}^{n−1} bi(x) pi(u)}, and the pi(u) are computed from the bits of u:
p0(u) = b_{n−1}(u)
p1(u) = b_{n−1}(u) + b_{n−2}(u)
p2(u) = b_{n−2}(u) + b_{n−3}(u)
.
.
p_{n−1}(u) = b1(u) + b0(u)
Inverse: f(x) = Σ_{u=0}^{N−1} H(u) (−1)^{Σ_{i=0}^{n−1} bi(x) pi(u)}, i.e. h(x,u) = (−1)^{Σ_{i=0}^{n−1} bi(x) pi(u)}.
• 2D ordered HT pair
H(u,v) = (1/N) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) (−1)^{Σ_{i=0}^{n−1} [bi(x) pi(u) + bi(y) pi(v)]}
f(x,y) = (1/N) Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} H(u,v) (−1)^{Σ_{i=0}^{n−1} [bi(x) pi(u) + bi(y) pi(v)]}
Sign pattern of the ordered kernel for N = 8 (rows u, columns x):

u\x  0 1 2 3 4 5 6 7
0    + + + + + + + +
1    + + + + - - - -
2    + + - - + + - -
3    + + - - - - + +
4    + - + - + - + -
5    + - + - - + - +
6    + - - + + - - +
7    + - - + - + + -

4. Explain Walsh Transform with suitable equations
Walsh Transform
For N = 2^n, the Walsh transform of f(x) is denoted by W(u); it is obtained by substituting the forward transformation kernel
g(x,u) = (1/N) Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u)}
so that
W(u) = (1/N) Σ_{x=0}^{N−1} f(x) Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u)} = Σ_{x=0}^{N−1} f(x) g(x,u)
where bk(z) is the kth bit in the binary representation of z.
• Inverse Walsh Transform
f(x) = Σ_{u=0}^{N−1} W(u) Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u)} = Σ_{u=0}^{N−1} W(u) h(x,u)
with inverse transformation kernel h(x,u) = Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u)}. The forward and inverse transformation kernels differ only by the factor 1/N.
• 2D Walsh Transform
W(u,v) = (1/N) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u) + bi(y) b_{n−1−i}(v)} = Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) g(x,y,u,v)
with forward transformation kernel g(x,y,u,v) = (1/N) Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u) + bi(y) b_{n−1−i}(v)}.
• 2D Inverse Walsh Transform
f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} W(u,v) Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u) + bi(y) b_{n−1−i}(v)} = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} W(u,v) h(x,y,u,v)
with inverse transformation kernel h(x,y,u,v) = Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u) + bi(y) b_{n−1−i}(v)}.
The Walsh transformation kernels are separable and symmetric; therefore the forward and inverse kernels factor in the same way:
g(x,y,u,v) = g1(x,u) g1(y,v)
h(x,y,u,v) = h1(x,u) h1(y,v)
= {(1/√N) Π_{i=0}^{n−1} (−1)^{bi(x) b_{n−1−i}(u)}} {(1/√N) Π_{i=0}^{n−1} (−1)^{bi(y) b_{n−1−i}(v)}}
• The Walsh transform can be obtained by a successive doubling method:
W(u) = 1/2 [Weven(u) + Wodd(u)]
W(u+M) = 1/2 [Weven(u) − Wodd(u)]
where M = N/2 and u = 0, 1, …, M−1.

5. Explain Discrete cosine transform in detail
The discrete cosine transform (DCT) gets its name from the fact that the rows of the N×N transform matrix C are obtained as a function of cosines:
[C]_{i,j} = √(1/N) cos[(2j+1)iπ/2N], i = 0, j = 0,1,…,N−1
[C]_{i,j} = √(2/N) cos[(2j+1)iπ/2N], i = 1,2,…,N−1, j = 0,1,…,N−1
The rows of the transform matrix can be shown in graphical form. The amount of variation increases as we progress down the rows; that is, the frequency of the rows increases as we go from top to bottom (shown for N = 8).
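As a rough check of the Hadamard relations above, the kernel matrix H[x,u] = (−1)^{Σ bi(x)bi(u)} can be built directly from its bit definition. This is an illustrative sketch (Python/NumPy assumed; the test signal is arbitrary):

```python
import numpy as np

def hadamard_matrix(N):
    """Unordered Hadamard kernel: H[x, u] = (-1)**(sum_i b_i(x) * b_i(u))."""
    n = N.bit_length() - 1                 # N = 2**n
    H = np.ones((N, N), dtype=int)
    for x in range(N):
        for u in range(N):
            s = sum(((x >> i) & 1) * ((u >> i) & 1) for i in range(n))
            H[x, u] = (-1) ** s
    return H

N = 8
H = hadamard_matrix(N)
f = np.array([1.0, 2.0, 0.0, 3.0, 1.0, 0.0, 2.0, 1.0])

Hu = (H @ f) / N        # forward: H(u) = (1/N) sum_x f(x) (-1)^(...)
f_rec = H @ Hu          # inverse: f(x) = sum_u H(u) (-1)^(...), kernels differ by 1/N

assert np.allclose(f_rec, f)                 # forward + inverse reconstructs f
assert np.allclose(H @ H, N * np.eye(N))     # rows are mutually orthogonal
```

The last assertion (H·H = N·I, with H symmetric) is exactly why the forward and inverse kernels can differ only by the scale factor 1/N.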
In the DFT, to find the Fourier coefficients for a sequence of length N, we assume that the sequence is periodic with period N. Because the end of the sequence generally does not match its beginning, this periodic extension introduces sharp discontinuities at the beginning and the end of the sequence. In order to represent these sharp discontinuities, the DFT needs nonzero coefficients for the high-frequency components. Because these components are needed only at the two end points of the sequence, their effect must be cancelled out at other points in the sequence; thus, the DFT adjusts the other coefficients accordingly. When we discard the high-frequency coefficients during the compression process, the coefficients that were cancelling out the high-frequency effect in other parts of the sequence result in the introduction of additional distortion.
The DCT can be obtained using the DFT by mirroring the original N-point sequence to obtain a 2N-point sequence; the DCT is simply the first N points of the resulting 2N-point DFT. When we take the DFT of the 2N-point mirrored sequence, we again have to assume periodicity, but here it does not introduce any sharp discontinuities at the edges.
The DCT is better at energy compaction for most correlated sources when compared to the DFT. For Markov sources with high correlation coefficient ρ,
ρ = E[xn x_{n+1}] / E[xn²]
the compaction ability of the DCT is very close to that of the KLT. As many sources can be modelled as Markov sources with high values of ρ, this superior compaction ability has made the DCT the most popular transform. It is a part of many international standards, including JPEG, MPEG and CCITT H.261.

Fig. Basic set of the discrete cosine transform. The numbers correspond to the rows of the transform matrix.

Also, the basis matrices show increased variation as we go from the top-left matrix, corresponding to the θ00 coefficient, to the bottom-right matrix, corresponding to the θ(N−1)(N−1) coefficient.
The DCT is closely related to the discrete Fourier transform (DFT), and the DCT can be obtained from the DFT. In terms of compression, the DCT performs better than the DFT.

UNIT III

PART-A

1. Specify the objective of image enhancement technique.
The objective of enhancement technique is to process an image so that the result is more suitable than the original image for a particular application.

2. List the 2 categories of image enhancement.
• Spatial domain refers to the image plane itself; approaches in this category are based on direct manipulation of the pixels of the image.
• Frequency domain methods are based on modifying the Fourier transform of the image.

3. What is the purpose of image averaging?
An important application of image averaging is in the field of astronomy, where imaging with very low light levels is routine, causing sensor noise frequently to render single images virtually useless for analysis.

4. What is meant by masking?
• A mask is a small 2-D array in which the values of the mask coefficients determine the nature of the process.
• The enhancement technique based on this type of approach is referred to as mask processing.

5. Define histogram.
The histogram of a digital image with gray levels in the range [0, L−1] is a discrete function h(rk) = nk, where rk is the kth gray level and nk is the number of pixels in the image having gray level rk.

6. What is meant by histogram equalization?
Sk = T(rk) = Σ_{j=0}^{k} Pr(rj) = Σ_{j=0}^{k} nj/n, where k = 0,1,2,…,L−1
This transformation is called histogram equalization.

7. Differentiate linear spatial filter and non-linear spatial filter.
Linear spatial filter:
1. The response is a sum of products of the filter coefficients and the corresponding image pixels:
R = w(−1,−1) f(x−1,y−1) + w(−1,0) f(x−1,y) + … + w(0,0) f(x,y) + … + w(1,0) f(x+1,y) + w(1,1) f(x+1,y+1)
or, for a 3×3 mask, R = w1z1 + w2z2 + … + w9z9 = Σ_{i=1}^{9} wi zi.
Non-linear spatial filter:
1. They do not explicitly use coefficients in a sum-of-products.

8. Give the mask used for high boost filtering.

 0  −1   0          −1  −1  −1
−1  A+4 −1          −1  A+8 −1
 0  −1   0          −1  −1  −1

9. What is meant by laplacian filter?
The Laplacian of a function f(x,y) of 2 variables is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y²

10. Write the steps involved in frequency domain filtering.
1. Multiply the input image by (−1)^{x+y} to center the transform.
2. Compute F(u,v), the DFT of the image from (1).
3. Multiply F(u,v) by a filter function H(u,v).
4. Compute the inverse DFT of the result in (3).
5. Obtain the real part of the result in (4).
6. Multiply the result in (5) by (−1)^{x+y}.

11. What do you mean by Point processing?
Image enhancement at any point in an image that depends only on the gray level at that point is often referred to as point processing.

12. Define Derivative filter?
For a function f(x,y), the gradient of f at coordinates (x,y) is defined as the vector
∇f = [∂f/∂x, ∂f/∂y]^T
∇f = mag(∇f) = [(∂f/∂x)² + (∂f/∂y)²]^{1/2}

13. Define spatial filtering
Spatial filtering is the process of moving the filter mask from point to point in an image. For a linear spatial filter, the response is given by a sum of products of the filter coefficients and the corresponding image pixels in the area spanned by the filter mask.

14. What is a Median filter?
The median filter replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel.

15. What is maximum filter and minimum filter?
The 100th percentile filter is the maximum filter, used in finding the brightest points in an image. The 0th percentile filter is the minimum filter, used for finding the darkest points in an image.

16. Write the application of sharpening filters
1. Electronic printing and medical imaging to industrial applications.
2. Autonomous target detection in smart weapons.

17. Name the different types of derivative filters
1. Prewitt operators
2. Roberts cross-gradient operators
3. Sobel operators

18. What is meant by Image Restoration?
Restoration attempts to reconstruct or recover an image that has been degraded by using a priori knowledge of the degradation phenomenon.
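The orthonormality of the DCT matrix C from Q5 and the energy-compaction comparison with the DFT can be probed numerically on a first-order Markov (AR(1)) signal. This is an illustrative sketch (Python/NumPy assumed; ρ = 0.95, N = 32 and the random seed are arbitrary choices):

```python
import numpy as np

def dct_matrix(N):
    """DCT matrix: C[i,j] = sqrt(1/N) cos((2j+1) i pi / 2N) for i = 0,
    and sqrt(2/N) cos((2j+1) i pi / 2N) for i >= 1."""
    j = np.arange(N)
    C = np.array([np.cos((2 * j + 1) * i * np.pi / (2 * N)) for i in range(N)])
    C[0] *= np.sqrt(1.0 / N)
    C[1:] *= np.sqrt(2.0 / N)
    return C

N = 32
C = dct_matrix(N)
assert np.allclose(C @ C.T, np.eye(N))       # C is orthonormal

# First-order Markov (AR(1)) source with high correlation coefficient rho
rng = np.random.default_rng(1)
rho = 0.95
x = np.zeros(N)
for k in range(1, N):
    x[k] = rho * x[k - 1] + rng.normal()

dct_coeff = C @ x
assert np.allclose(C.T @ dct_coeff, x)       # perfect reconstruction

def top_k_energy(coeff, k):
    """Fraction of total energy carried by the k largest coefficients."""
    e = np.sort(np.abs(coeff) ** 2)[::-1]
    return e[:k].sum() / e.sum()

dft_coeff = np.fft.fft(x) / np.sqrt(N)       # unitary DFT for comparison
print(top_k_energy(dct_coeff, 4), top_k_energy(dft_coeff, 4))
```

For highly correlated sources the DCT typically concentrates more of the signal energy in its few largest coefficients than the DFT, which is the compaction property described above (individual random draws can vary).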
19. What are the two properties in Linear Operator?
• Additivity
• Homogeneity

20. Give the additivity property of Linear Operator
H[f1(x,y) + f2(x,y)] = H[f1(x,y)] + H[f2(x,y)]
The additivity property says that if H is a linear operator, the response to a sum of two inputs is equal to the sum of the two responses.

21. How a degradation process is modeled?

f(x,y) → H → ⊕ → g(x,y)
             ↑
           η(x,y)

A system operator H, together with an additive white noise term η(x,y), operates on an input image f(x,y) to produce a degraded image g(x,y).

22. Give the homogeneity property in Linear Operator
H[k1 f1(x,y)] = k1 H[f1(x,y)]
The homogeneity property says that the response to a constant multiple of any input is equal to the response to that input multiplied by the same constant.

23. Give the relation for degradation model for continuous function
g(x,y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(α,β) h(x−α, y−β) dα dβ + η(x,y)

24. What is Fredholm integral of first kind?
g(x,y) = ∫∫ f(α,β) h(x,α,y,β) dα dβ
which is called the superposition, or convolution, or Fredholm integral of the first kind. It states that if the response of H to an impulse is known, the response to any input f(α,β) can be calculated by means of the Fredholm integral.

25. Define circulant matrix
A square matrix in which each row is a circular shift of the preceding row, and the first row is a circular shift of the last row, is called a circulant matrix.

He = | he(0)    he(M−1)  he(M−2)  …  he(1) |
     | he(1)    he(0)    he(M−1)  …  he(2) |
     |   .                                  |
     |   .                                  |
     | he(M−1)  he(M−2)  he(M−3)  …  he(0) |

26. What is the concept of algebraic approach?
The concept of the algebraic approach is to estimate the original image which minimizes a predefined criterion of performance.

27. What are the two methods of algebraic approach?
• Unconstrained restoration approach
• Constrained restoration approach

28. Define Gray-level interpolation
Gray-level interpolation deals with the assignment of gray levels to pixels in the spatially transformed image.

29. What is meant by Noise probability density function?
The spatial noise descriptor is the statistical behavior of the gray-level values in the noise component of the model.

30. Why the restoration is called as unconstrained restoration?
In the absence of any knowledge about the noise ‘n’, a meaningful criterion function is to seek an f^ such that H f^ approximates g in a least-squares sense, assuming the noise term is as small as possible, where H = system operator, f^ = estimated input image, and g = degraded image.

31. Which is the most frequent method to overcome the difficulty to formulate the spatial relocation of pixels?
The use of tiepoints is the most frequent method; tiepoints are subsets of pixels whose locations in the input (distorted) and output (corrected) images are known precisely.

32. What are the three methods of estimating the degradation function?
1. Observation
2. Experimentation
3. Mathematical modeling

33. What are the types of noise models?
• Gaussian noise
• Rayleigh noise
• Erlang (Gamma) noise
• Exponential noise
• Uniform noise
• Impulse noise

34. Give the relation for Gaussian noise
The PDF of a Gaussian random variable Z is given by
P(Z) = e^{−(Z−μ)²/2σ²} / (√(2π) σ)
where Z is the gray-level value, μ is the mean of the gray-level value Z, σ is the standard deviation, and σ² is the variance of Z.

35. Give the relation for Rayleigh noise
The PDF is
P(Z) = (2/b)(Z−a) e^{−(Z−a)²/b} for Z ≥ a; 0 for Z < a
mean μ = a + √(πb/4); variance σ² = b(4−π)/4

36. Give the relation for Gamma noise
The PDF is
P(Z) = a^b Z^{b−1} e^{−aZ}/(b−1)! for Z ≥ 0; 0 for Z < 0
mean μ = b/a; variance σ² = b/a²

37. Give the relation for Exponential noise
The PDF is
P(Z) = a e^{−aZ} for Z ≥ 0; 0 for Z < 0
mean μ = 1/a; variance σ² = 1/a²

38. Give the relation for Uniform noise
The PDF is
P(Z) = 1/(b−a) if a ≤ Z ≤ b; 0 otherwise
mean μ = (a+b)/2; variance σ² = (b−a)²/12

39. Give the relation for Impulse noise
The PDF is
P(Z) = Pa for Z = a; Pb for Z = b; 0 otherwise

40. What is inverse filtering?
The simplest approach to restoration is direct inverse filtering: an estimate F^(u,v) of the transform of the original image is obtained simply by dividing the transform of the degraded image G(u,v) by the degradation function:
F^(u,v) = G(u,v) / H(u,v)

41. What is pseudo inverse filter?
It is the stabilized version of the inverse filter. For a linear shift-invariant system with frequency response H(u,v), the pseudo-inverse filter is defined as
H⁻(u,v) = 1/H(u,v) for H ≠ 0; 0 for H = 0

42. What is meant by least mean square filter?
The limitation of the inverse and pseudo-inverse filters is that they are very sensitive to noise. Wiener filtering is a method of restoring images in the presence of blur as well as noise.

43. Give the difference between Enhancement and Restoration
• An enhancement technique is based primarily on the pleasing aspects it might present to the viewer. For example: contrast stretching.
• Whereas removal of image blur by applying a deblurring function is considered a restoration technique.

Histogram Specification
• The histogram equalization method does not lend itself to interactive application.
• Let Pr(r) and Pz(z) be the original and desired probability density functions. Suppose histogram equalization is applied to the original image:
s = T(r) = ∫ Pr(w) dw ………………………………….(5)
• The desired image levels could be equalized using the transformation function
v = G(z) = ∫ Pz(w) dw ……………………………..(6)
• The inverse process is z = G⁻¹(v). Here Ps(s) and Pv(v) are identical uniform densities, so z = G⁻¹(s). Assuming that G⁻¹(s) is single-valued, the procedure begins by equalizing the levels of the original image using eqn (4).

PART-B

1. Explain different mean filters
• Arithmetic mean filter
f^(x,y) = (1/mn) Σ_{(s,t)∈Sxy} g(s,t)
• Geometric mean filter
An image restored using a geometric mean filter is given by the expression
f^(x,y) = [ Π_{(s,t)∈Sxy} g(s,t) ]^{1/mn}
Here each restored pixel is given by the product of the pixels in the subimage window, raised to the power 1/mn.
• Harmonic mean filter
The harmonic mean filtering operation is given by the expression
f^(x,y) = mn / Σ_{(s,t)∈Sxy} (1/g(s,t))
• Contraharmonic mean filter
The contraharmonic mean filtering operation yields a restored image based on the expression
f^(x,y) = Σ_{(s,t)∈Sxy} g(s,t)^{Q+1} / Σ_{(s,t)∈Sxy} g(s,t)^{Q}
where Q is called the order of the filter. This filter is well suited for reducing or virtually eliminating the effect of salt-and-pepper noise.

2. Draw the degradation model and explain.

f(x,y) → Degradation function H → ⊕ → g(x,y) → Restoration filter(s) → f^(x,y)
                                   ↑
                            noise η(x,y)

• If H is a linear, position-invariant process, then the degraded image is given in the spatial domain by
g(x,y) = h(x,y)*f(x,y) + η(x,y)
where h(x,y) is the spatial representation of the degradation function and the symbol “*” indicates spatial convolution.
• Convolution in the spatial domain is equal to multiplication in the frequency domain.
• The equivalent frequency-domain representation is
G(u,v) = H(u,v)F(u,v) + N(u,v)
where the terms in capital letters are the Fourier transforms of the corresponding terms in the previous equation.
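The frequency-domain model G = HF + N and the pseudo-inverse filter of Q41 can be sketched in one dimension (Python/NumPy assumed; the blur, the signal and the threshold eps are illustrative choices, and the noise term is omitted so the restoration can be checked exactly):

```python
import numpy as np

N = 64
f = np.zeros(N)
f[20:40] = 1.0                       # original 1-D "image" f

# Degradation: circular convolution with a 5-point moving-average blur h
h = np.zeros(N)
h[:5] = 1.0 / 5.0
H = np.fft.fft(h)                    # degradation function H(u)
G = np.fft.fft(f) * H                # G = H F  (noise term omitted here)
g = np.real(np.fft.ifft(G))          # degraded signal g = h * f

# Pseudo-inverse filter: F^ = G/H where |H| > eps, and 0 wherever H ~ 0
eps = 1e-3                           # stabilization threshold (assumed)
F_hat = np.zeros(N, dtype=complex)
mask = np.abs(H) > eps
F_hat[mask] = G[mask] / H[mask]
f_hat = np.real(np.fft.ifft(F_hat))

# With no noise and no zeroed frequencies, restoration is essentially exact
assert np.allclose(f_hat, f, atol=1e-8)
```

With noise added, G = HF + N and the direct inverse filter gives F^ = F + N/H, so N/H blows up wherever |H| is small; the thresholding above is exactly the stabilization the pseudo-inverse filter provides.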
3. Write short notes on Median Filters
Introduction:
- The median filter is one of the smoothing filters.
- No mask coefficients are used in the median filter.
- Choose a 3×3 subimage, arrange its values in ascending order, and take the middle value. For example:

3  5  7
2  10 20
30 9  4

Sorted: 2, 3, 4, 5, 7, 9, 10, 20, 30 → median = 7.
- The median filter is a non-linear spatial filter.
Related order-statistics filters:
1) median filtering (noise smoothing)
2) Max filter
3) Min filter
Max filter: R = Max — the max filter gives the brightest points in the image.
Min filter: R = Min — it helps to find the darkest points in the image.

4. Write short notes on Wiener Filtering.
• The inverse filtering approach makes no explicit provision for handling noise.
• Wiener filtering is an approach that incorporates both the degradation function and the statistical characteristics of noise into the restoration process.
• The method is founded on considering images and noise as random processes, and the objective is to find an estimate f^ of the uncorrupted image f such that the mean square error between them is minimized.
• This error measure is given by
e² = E{(f − f^)²}
where E{·} is the expected value of the argument.
• It is assumed that the noise and the image are uncorrelated, that one or the other has zero mean, and that the gray levels in the estimate are a linear function of the levels in the degraded image.
• Based on these conditions, the minimum of the error function is given in the frequency domain by the Wiener filter expression. This result is known as the Wiener filter, after N. Wiener, who proposed the concepts in 1942. The filter, which consists of the term inside the brackets, is also commonly referred to as the minimum mean square error filter or the least square error filter. References containing detailed derivations of the Wiener filter are included at the end.
• The restored image in the spatial domain is given by the inverse Fourier transform of the frequency-domain estimate F^(u,v).
• If the noise is zero, the noise power spectrum vanishes and the Wiener filter reduces to the inverse filter. However, the power spectrum of the undegraded image seldom is known, so it is replaced by a specified constant K, whose value is chosen interactively to yield the best visual results.
• Compared with direct inverse filtering, the full inverse-filtered result is typically unusable, as is the radially limited inverse filter. The Wiener filter result is by no means perfect, but it does give a hint as to image content: the noise is still quite visible, but the text can be seen through a “curtain” of noise.

2. Explain Spatial Filtering
• The use of spatial masks for image processing is usually called spatial filtering, and the spatial masks are spatial filters.
• Linear filters are classified into
o Low pass
o High pass
o Band pass filtering
• Consider a 3×3 mask:

W1 W2 W3
W4 W5 W6
W7 W8 W9

• Denoting the gray levels of the pixels under the mask at any location by z1, z2, …, z9, the response of a linear mask is
R = w1z1 + w2z2 + … + w9z9

PART-C

1. Explain Histogram processing.
• The histogram of a digital image with gray levels in the range [0, L−1] is the discrete function p(rk) = nk/n, where rk is the kth gray level, nk is the number of pixels with that gray level, n is the total number of pixels in the image, and k = 0,1,2,…,L−1.
• p(rk) gives an estimate of the probability of occurrence of gray level rk.
Figure: histograms corresponding to four basic image types.

Histogram Equalization
• Let the variable r represent the gray levels in the image to be enhanced. The pixel values are continuous quantities normalized to lie in the interval [0,1], with r = 0 representing black and r = 1 representing white.
• Consider transformations of the form
s = T(r) …………………………………(1)
which produce a level s for every pixel value r in the original image. T(r) satisfies the conditions:
o T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1, and
o 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
Condition 1 preserves the order from black to white in the gray scale; condition 2 guarantees a mapping that is consistent with the allowed range of pixel values. The inverse transformation is
r = T⁻¹(s), 0 ≤ s ≤ 1 ………………………..(2)
• The probability density function of the transformed gray levels is
Ps(s) = [Pr(r) dr/ds]_{r=T⁻¹(s)} …………………….(3)
• Consider the transformation function
s = T(r) = ∫₀^r Pr(w) dw, 0 ≤ r ≤ 1 …………………….(4)
where w is the dummy variable of integration. From eqn (4), the derivative of s with respect to r is
ds/dr = Pr(r)
Substituting dr/ds into eqn (3) yields
Ps(s) = 1, 0 ≤ s ≤ 1
i.e. a uniform density.

Histogram Specification (procedure, continued)
2. Specify the desired density function and obtain the transformation function G(z) using eqn (6).
3. Apply the inverse transformation function z = G⁻¹(s) to the levels obtained in step 1.
The result can be obtained in one step with the combined transformation function
z = G⁻¹[T(r)] ……………………………..(7)
Histogram specification for digital images is a limited one; in practice one can
1. first specify a particular histogram by digitizing the given function, or
2. specify a histogram shape by means of a graphic device whose output is fed into the processor executing the histogram specification algorithm.
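The discrete form of the equalization mapping, sk = T(rk) = Σ_{j≤k} nj/n scaled to [0, L−1], can be sketched as follows (Python/NumPy assumed; the 4×4 image and L = 8 are illustrative choices):

```python
import numpy as np

def histogram_equalize(img, L=256):
    """Discrete histogram equalization: s_k = round((L-1) * sum_{j<=k} n_j / n)."""
    hist = np.bincount(img.ravel(), minlength=L)   # n_k for each gray level
    cdf = np.cumsum(hist) / img.size               # sum_{j<=k} n_j / n
    mapping = np.round((L - 1) * cdf).astype(img.dtype)
    return mapping[img]                            # apply s_k = T(r_k) per pixel

# A small dark image: most pixels crowded into the low gray levels
img = np.array([[0, 1, 1, 2],
                [2, 2, 3, 3],
                [3, 3, 3, 4],
                [4, 5, 6, 7]], dtype=np.uint8)
eq = histogram_equalize(img, L=8)

# Equalization spreads the levels across the full [0, L-1] range,
# and the mapping is monotonic (order from black to white is preserved)
assert eq.min() == 0 and eq.max() == 7
```

Because T(rk) is a cumulative sum it is automatically monotonically increasing, which is the discrete counterpart of condition 1 above.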

Smoothing Filters [Link] transformations:- P(Z)=ab zb-1 ae-az/(b-1) for Z>=0


• Lowpass Spatial filtering: An image f of pixel coordinates(x,y) undergoes geometric distortion to 0 for Z<0
 The filter has to have positive coefficient. produce an image g with coordinates(x’,y’).this transformation may be expressed • The inverse (exponential) operation yields the desird enhanced image, denoted by mean μ=b/a
 The response would be the sum of gray levels of nine pixels which as g(x,y); that is, standard deviation σ2=b/a2
could cause R to be out of the gray level range. x’=r(x,y) Ln[f(x,y)] = ln[i(x,y) r(x,y)] Exponential noise
 The solution is to scale the sum by dividing R by [Link] use of the y’=s(x,y) F[ln(f(x,y))] = F[ln(i(x,y)]+F[ln( r(x,y))] The PDF is
form of mask are called neighborhood averaging • where r(x,y) and s(x’,y’) are the spatial transformations that produced the P(Z)= ae-az Z>=0
geometrically distorted image g(x’,y’). 0 Z<0
1 1 1 mean μ=1/a
• If r(x,y) and s(x,y) were known analytically recovering f(x,y) from the distorted • This method is based on a special case of a class of systems know as
1/9 1 1 1 image g(x’,y’) by applying the transformations in reverse might possible homomorphism systems. standard deviation σ2=1/a2
1 1 1 theoretically. • In this particular application,
• The method used most frequently to formulate the spatial relocation of pixels by  The key to the approach is the separation of the illumination and Uniform noise:
• Median filtering: the use of tiepoints,which are a subset of pixels whose location in the input and reflectance components achieved in the from. The PDF is
output image is known precisely.  The homomorphism filter function can then operate on these on P(Z)=1/(b-a) if a<=Z<=b
 To achive noise reduction rather than blurring.
• The vertices of the quadrilaterals are corresponding tiepoints. these component separately. 0 otherwise
 The gray level of each pixel is replaced by the median of the gray mean μ=a+b/2
level in the neighbourhood of that pixel  r(x,y)=c1x+c2y+c3xy+c4  The illumination components of an image generally is
standard deviation σ2=(b-a)2/12
Sharpening Filters  S(x,y)=c5x+c6y+c7xy+c8 characterized by slow spatial variations.
Impulse noise:
• Basic highpass spatial filtering:  x’=c1x+c2y+c3xy+c4  While the reflectance component tends to vary abruptly, The PDF is
 The filter should be positive ecoefficient near the center and  y’=c5x+c6y+c7xy+c8 particularly at the junction, while the reflectance component tends P(Z) =Pa for z=a
negative in the outer periphery. • Since there are a total of eight known tiepoints these equations can be solved for to vary abruptly, particularly at the junctions of dissimilar objects. Pb for z=b
 The sum of the coefficient are 0. eight coefficients ci,i=1,2,…8.  A good deal of control can be gained over the illumination and 0 Otherwise
 This eliminate the zero- frequency term reducing significantly the • The coefficient constitute the geometric distortion model used to transform all reflectance components with a homomorphic filter.
global contrast of the image pixels within the quadrilateral region defined by the tiepoints used to obtain the  This control requires specification of a filter function H(u.v) that UNIT I V
coefficients. affects the low - and high – frequency components of the Fourier
-1 -1 -1
• Tiepoints are established by a number of different techniques depending on the transform in different ways. 1. What is segmentation?
1/9* -1 8 -1 application. Segmentation subdivides on image in to its constitute regions or objects. The level
-1 -1 -1 2. Gray level Interpolation:- [Link] the different Noise Distribution in detail. to which the subdivides is carried depends on the problem being solved .That is
• High_boost filtering: • Depending on the values of coefficients ci equations can yield noninteger values Introduction: segmentation should when the objects of interest in application have been isolated.
The definition is for x’ and y’. • Noise are unwanted signal which corrupts the original signal.
High-boost=(A)(Original)-Lowpass • Because the distorted image g is digital ,its pixel values are defined only at integer • Origin of noise source is during image acquisition and/or transmission and 2. Write the applications of segmentation.
=(A-1) (Original)+ Original –Lowpass co ordinates . digitization. • Detection of isolated points.
=(A-1) (Original)+Hignpass • Thus using non integer values for x’, y’ causes a mapping into locations of g for • During capturing ,performance of imaging sensors are affected by the • Detection of lines and edges in an image.
• Derivative Filters: which no gray levels are defined. environmental conditions due to the quality of sensors.
 Averaging is anlog to integration , differentiation can be expected • The technique is used to accomplish this is called gray level interpolation. • Image acquisition are the principle source of noise. 3. What are the three types of discontinuity in digital image?
to have opposite effect and thus sharpen the image • Due to the interference in the transmission it will affect the transmission of the Points, lines and edges.
[Link] homomorphic filtering image.
• The illumination – reflectance model can be used to develop a frequency domain • Types: 4. How the derivatives are obtained in edge detection during formulation?
3. Explain the Geometric Transformations used in image restoration. procedure for improving the appearance of an image by simultaneous gray – level Rayleigh noise: The first derivative at any point in an image is obtained by using the magnitude of
• Geometric transformations are used for image restoration, modify the spatial compression and contrast enhancement. The PDF is the gradient at that point. Similarly the second derivatives are obtained by using the
relationship between the pixels in an image. • An image can be expressed as the product of illumination and reflectance p(z) = (2/b)(z-a)e^(-(z-a)²/b) for z >= a laplacian.
• Geometric transformations are often called rubber sheet transformations, because components. p(z) = 0 for z < a
they may be viewed as the process of printing an image on a sheet of rubber. f(x,y) = i(x,y) r(x,y) mean μ = a + √(πb/4) 5. Write about linking edge points.
• The geometric transformations consist of two basic operations: F(f(x,y)) = F(i(x,y) r(x,y)) variance σ² = b(4-π)/4 The approach for linking edge points is to analyze the characteristics of pixels in a
(1) Spatial transformation standard deviation σ2=b(4-π)/4 small neighborhood (3x3 or 5x5) about every point (x,y)in an image that has undergone
(2) Gray level interpolation Where F i(u,v)) and F(r(u,v)) are the Fourier transformation of i(x,y)and r(x,y) edge detection. All points that are similar are linked, forming a boundary of pixels that
respectively. Gamma noise: share some common properties.
The PDF is

6. What are the two properties used for establishing similarity of edge pixels? 14. Define pattern. • Based on the first category ,the approach is based on abrupt changes in gray level The Laplacian
(1) The strength of the response of the gradient operator used to produce the edge A pattern is a quantitative or structural description of an objective or some other entity and the areas of interest based on this category are detection of isolated points • The laplacian of a 2-D function f(x,y) is a second order derivatives defined as
pixel. of interest in an image, and detection of lines and edges in an image. ▼²ƒ = ∂²ƒ/∂x² + ∂²ƒ/∂y²
(2) The direction of the gradient. • Based on the second category the approach is based on thresholding, region • The first laplacian is combined with smoothing as a precursor to finding edges via
15. Define pattern class. growing and region splitting and merging . zero crossings. Consider the function.
7. What is edge? A pattern class is a family of patterns that share some common properties .Pattern • The concept of segmenting an image based on discontinuity or similarity of the ▼² ƒ=8z5-(z1+z2+z3+z4+z6+z7+z8+z9)
An edge is a set of connected pixels that lie on the boundary between two regions. classes are denoted w1, w2, …, wM, where M is the number of classes. gray level values of its pixels is applicable to both static and dynamic images. 0 -1 0
edges are more closely modeled as having a ramplike profile. The slope of the ramp is -1 4 -1
inversely proportional to the degree of blurring in the edge. [Link] the three pattern arrangements. 2. Write short notes on edge detection 0 -1 0
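The 4-neighbour Laplacian mask shown above can be applied at a single interior pixel as follows (a minimal sketch; the helper name is an assumption):

```python
import numpy as np

# 4-neighbour Laplacian mask from the notes:
#   0 -1  0
#  -1  4 -1
#   0 -1  0
LAPLACIAN = np.array([[0, -1, 0],
                      [-1, 4, -1],
                      [0, -1, 0]], dtype=float)

def laplacian_at(img, i, j):
    """Apply the 3x3 Laplacian mask centred on interior pixel (i, j)."""
    patch = img[i - 1:i + 2, j - 1:j + 2]
    return float(np.sum(patch * LAPLACIAN))
```

On a linearly varying (ramp) image the response is zero, while an isolated bright point gives a strong response, which is why the Laplacian is used to detect isolated points and zero crossings.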
Vectors Edge Detection:
8. Give the properties of the second derivative around an edge? Strings • Edge detection is “local” image processing methods designed to detect edge
• The sign of the second derivative can be used to determine whether an edge pixel Trees [Link] Short notes on edge linking by local processing.
pixels. • One of the simplest approaches for linking edge points is to analyze the
lies on the dark or light side of an edge. • Concept that is based on a measure of intensity-level discontinuity at a point.
• It produces two values for every edge in an image. 17. Give the decision theoretic methods. characteristics of the pixels in a small neighborhood about every point in an
• It is possible to link edge points into edge segments, and sometimes these image that has undergone edge detection.
• An imaginary straight line joining the extreme positive and negative values of the Matching - Matching by minimum distance classifier
segments are linked in such a way that they correspond to boundaries, but this is • Two properties used for establishing similarity of edge pixels in the analysis are
second derivative would cross zero near the midpoint of the edge. Matching by correlation
not always the case.  The strength of the response of the gradient operator used to
The image gradient and its properties: produce the edge pixel,
9. Define Gradient Operator? 18. Define training pattern and training set.
The patterns used to estimate the parameters are called training patterns, and a set of such • The tool of choice for finding edge strength and direction at location (x,y) of an  The direction of the gradient.
First order derivatives of a digital image are based on various approximation of image, f, is the gradient, denoted by ▼ƒ, a and defined as the vector
the 2-D gradient. The gradient of an image f(x,y) at location(x,y) is defined as the vector patterns from each class is called a training set. The first property is given by the value of ▼f.
gx ∂ƒ/∂x Thus an edge pixel with coordinates (x’,y’) and in the predefined neighborhood of (x,y) is
Magnitude of the vector is
∆f = mag(∆f) = [Gx² + Gy²]^(1/2) 19. Define training similar in magnitude to the pixel at (x,y) if |▼f(x,y) - ▼f(x’,y’)| <= T where T is a
The process by which a training set is used to obtain decision functions is called ▼ƒ≡grad(ƒ)≡ gy = ∂ƒ/∂y nonnegative threshold.
α(x,y) = tan⁻¹(Gy/Gx)
α(x,y) is the direction angle of vector ∆f learning or training. The direction of the gradient vector is given by
• The magnitude length of vector ▼ƒ, denoted as M(x,y) α(x,y)=tanˉ¹ gy/gx
20. What are the layers in back propagation network? M(x,y)=mag(▼ƒ)=√gx²+gy² Then an edge pixel at (x’,y’) in the predefined neighborhood of (x,y) has an angle similar
10. What is meant by object point and background point? Is the value of the rate of change in the direction of the
To extract the objects from the background, select a threshold T that Input layer, Hidden layer and output layer to the pixel at (x,y) if |α(x,y) - α(x’,y’)| < A where A is an angle threshold. Note that the
gradient vector. direction of the edge at (x,y) in reality is perpendicular to the direction of the gradient
separate these modes. Then any point (x,y) for which f(x,y)>T is called an object point.
PART –B • The direction of the gradient vector is given by the angle vector at that point.
Otherwise the point is called background point.
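The object/background rule above (f(x,y) > T gives an object point) is easy to sketch; the iterative scheme for choosing T below is a standard heuristic added for illustration, not something the notes prescribe.

```python
import numpy as np

def global_threshold(img, T):
    """Points with f(x,y) > T are object points; the rest are background."""
    return img > T  # boolean mask: True = object, False = background

def iterative_threshold(img, eps=0.5):
    """Pick T iteratively: split pixels by the current T, then set the new T
    to the midpoint of the two group means (assumed heuristic)."""
    T = img.mean()
    while True:
        obj = img[img > T]
        bg = img[img <= T]
        T_new = 0.5 * (obj.mean() + bg.mean())
        if abs(T_new - T) < eps:
            return T_new
        T = T_new
```

For a clearly bimodal image the iteration settles quickly between the two modes.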
1. Write short notes on image segmentation. α(x,y)=tanˉ¹ gy/gx A point in the predefined neighborhood of (x,y) is linked to the pixel at (x’,y’) if both
11. What is global, local and dynamic or adaptive threshold? • Segmentation subdivides an image into its constituent regions or objects. The level magnitude and direction criteria are satisfied. This process is repeated for every location
to which the subdivision is carried depends on the problem being solved. measured with respect to the x-axis. in the image. A record must be kept of linked points as the center of the neighborhood is
When Threshold T depends only on f(x,y) then the threshold is called global . If T
depends both on f(x,y) and p(x,y) is called local. If T depends on the spatial coordinates x • Examples: In autonomous air to ground target acquisition applications identifying • Follows, using these differences as our estimates of the partials, that ∂ƒ/∂x=-2 and moved from pixel to pixel. A simple bookkeeping procedure is to assign a different gray
and y the threshold is called dynamic or adaptive where f(x,y) is the original image. vehicles on a road is of interest. ∂ƒ/∂y=2 at the point in equation. Then level to each set of linked edge pixels.
• The first step is to segment the road from the image and then to segment the gx ∂ƒ/∂x -2
12. Define region growing? elements of the road down to objects of a range of sizes that correspond potential ▼ƒ= = = [Link] short notes on the applications of artificial neural networks in image
Region growing is a procedure that groups pixels or subregions in to layer regions vehicles. gy ∂ƒ/∂y 2 processing.
based on predefined criteria. The basic approach is to start with a set of seed points and • In target acquistition ,the system designer has n control of the environment.
from there grow regions by appending to each seed these neighbouring pixels that have from which we obtain M(x,y)=2√2 at that point. The real-time automatic images processing and pattern recognition are very important for
• So the usual approach is to focus on selecting the types of sensors most likely to Gradient operators:
properties similar to the seed. enhance the objects of interest . many problems in medicine, physics, geology, space research, military applications and
• Obtaining the gradient of an image requires computing the partials derivatives so on. For example, it is necessary for pilots and drivers for immediate decision-making
• Example is the use of infrared imaging to detect objects with a strong heat ∂ƒ/∂x and ∂ƒ/∂y at every pixel location in the image.
13. Specify the steps involved in splitting and merging? in poor visibility conditions. An approach to image enhancement through artificial neural
signature,such as tanks in motion.
Split into 4 disjoint quadrants any region Ri for which P(Ri)=FALSE. network’s (ANN) processing is [Link] is for images enhancement through
• Segmentation algorithms for monochrome images are based on one of the two gx=∂ƒ(x,y)/ ∂x= ƒ(x+1,y)- ƒ(x,y) approximation of image transform function T. This function is approximated with use of
Merge any adjacent regions Rj and Rk for which P(RjURk)=TRUE. basic properties of gray level values . They are discontinuity and similarity.
Stop when no further merging or splitting is positive. gy=∂ƒ(x,y)/ ∂y= ƒ(x,y+1)- ƒ(x,y) ANN which is trained evolutionary in the time of test images processing. Each ANN is
• An approach used frequently is to approximate the gradient by absolute value: genetically encoded as the list of its connections. Truncation selection is used for parental
▼ƒ ≈ |Gx| + |Gy| structures of the ANNs undergoing recombination and mutation, are used. Nodes with
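The forward-difference estimates gx = ƒ(x+1,y) - ƒ(x,y) and gy = ƒ(x,y+1) - ƒ(x,y), the magnitude M(x,y), the |Gx| + |Gy| approximation, and the direction angle α(x,y) can be sketched together (function names are assumptions):

```python
import numpy as np

def gradient(img):
    """Forward-difference gradient estimates and derived quantities."""
    gx = np.zeros(img.shape, dtype=float)
    gy = np.zeros(img.shape, dtype=float)
    gx[:-1, :] = img[1:, :] - img[:-1, :]   # gx = f(x+1,y) - f(x,y)
    gy[:, :-1] = img[:, 1:] - img[:, :-1]   # gy = f(x,y+1) - f(x,y)
    mag = np.sqrt(gx**2 + gy**2)            # M(x,y)
    approx = np.abs(gx) + np.abs(gy)        # |Gx| + |Gy| approximation
    direction = np.arctan2(gy, gx)          # alpha(x,y)
    return gx, gy, mag, approx, direction
```

For a ramp whose intensity grows by 2 per row the exact magnitude and the absolute-value approximation coincide, since one partial is zero.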
structures of the ANNs undergoing recombination and mutation, are used. Nodes with

sigmoid activation functions are considered. The population size adapts to the properties The objective of segmentation is to partition an image into regions. We We introduced a simple model in which an image f(x,y) is formed as the
of evolution during the algorithm run using simple resizing strategy. In this application approached this problem by finding boundaries between based on discontinuities in gray product of a reflectance component r(x,y) and an illumination components i(x,y).
pixel-by-pixel brightness processing with use of ANN paradigm is adopted. The topology levels, segmentation was accomplished via thresholds based on the distribution of pixels consider the computer generated reflectance function.
of ANN is tuned simultaneously with connections weights. The ANN approximating T properties, such as gray level values or color.  The histogram of this function is clearly bimodal and could be portioned easily by
function should have three input nodes and one output node. During the training we Basic Formulation: placing a single global threshold, T, in the histogram valley.
evaluate each ANN with respect to the visual quality of the processed images. Let Represent the region of image. We may view segmentation as a process that  Multiplying the reflectance function by the illumination function.
partition R into n subregions R1, R2, …, Rn, such that n
The three-step procedure for image enhancement is proposed: n threshold an impossible task.
(1) multiplicative adjustment of image brightness (a) ΣRi=R  Although we seldom have the reflectance function by itself to work with, this
(2) local level processing using ANN; i=1 Fig: Structure of a basic neuron
simple illustration shows that the reflective nature of objects and background can
(3) global level auto smoothing algorithm. (b) Ri is a connected region, i=1,2,…………..n. be such that they are separable.
(c) Ri ∩ Rj = Ф for all i and j, i ≠ j. ƒ(x,y) = i(x,y)r(x,y)
The artificial neural network training stage with use of single 128х128 pixels image takes (d) P(Ri)=TRUE for i=1,2,……………………n. Back propagation training algorithm:
Taking the natural logarithm of this equation yields a sum: Basic neuron: Figure shows the Structure of a basic neuron. A set of inputs applied
about 70 seconds on the Intel Pentium IV 3 GHz processor. After completion of the (e) P(RiURj)=FALSE for i≠j. z(x,y)=ln ƒ(x,y)
learning process the obtained artificial neural network is ready to process either from the outside of from a previous layer. Each of these is multiplied by a weight
=ln i(x,y)+ln r(x,y) and the products are summed. This summation of this products is turned NET and must
arbitrary images that were not presented during the training. The processing time for Here, P(Ri) is a logical predicate defined over the points in set Ri and Ф is the null set. =i (x,y)+r (x,y)
512х512 pixels image is about 0.25 second. The ANN, as a rule, included 3 input nodes,  Condition (a) indicates that the segmentation must be complete that is every pixel must be calculated for each neuron in the network. After NET is calculated an activation
 If i (x,y) and r (x,y) are independent random variable, the histogram of z(x,y) is function F is applied to modify it, thereby producing the signal out
one or more hidden nodes and one output node. be in a region. given by the convolution of the histogram of i (x,y) and r (x,y).
 Condition (b) requires that points in a region must be connected in some predefined Where,
 But if i (x,y) had a border histogram the convolution process would smear the NET=x1w1+x2w2+……………………xnwn
sense. histogram of r (x,y), yielding a histogram for z(x,y) whose shape could be quite n
 Condition(c) indicates that the regions must be disjoint. different from that of the histogram of r (x,y). NET=Σxiwi
 Condition(d) deals with the properties that must be satisfied by the pixels in a segmented  The degree of distortion depends on the broadness of the histogram of i (x,y), i=1
region. which in turn depends on the nonuniformity of the illumination function. and out=F(NET)
Region Growing:  We have dealt with the logarithm of ƒ(x,y), instead of dealing with the image
As its name implies region growing is a procedure that groups pixel or subregions function directly.
into larger regions based on predefined criteria. The basic approach is to start with a set  When access to the illumination source is available, a solution frequently used in
of “seed” points and from these grow regions. practice to compensate for nonuniformity is to project the illumination pattern
 If the result of these computation shows clusters of values, the pixels whose properties onto a constant, white reflective surface.
place them near the centroid of these clusters can be used as seeds.  This yields an image g(x,y)=ki(x,y), where k is a constant that depends on the
 Descriptors alone can yield misleading results if connectivity or adjacency information is surface and i(x,y) is the illumination pattern.
not used in the region growing process.  For any image ƒ(x,y)=i(x,y)r(x,y) obtained from the same illumination function,
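Seed-based region growing as described above can be sketched with a breadth-first flood fill; the similarity predicate (absolute intensity difference from the seed within a tolerance) is an assumed criterion for illustration, and it uses connectivity, as the note on adjacency recommends.

```python
import numpy as np
from collections import deque

def region_grow(img, seed, tol=10):
    """Grow a region from `seed` by appending 4-neighbours whose intensity
    differs from the seed value by at most `tol` (assumed predicate)."""
    h, w = img.shape
    seed_val = float(img[seed])
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= ni < h and 0 <= nj < w and not region[ni, nj]:
                if abs(float(img[ni, nj]) - seed_val) <= tol:
                    region[ni, nj] = True
                    queue.append((ni, nj))
    return region
```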
Region Splitting and Merging: simply dividing ƒ(x,y) by g(x,y) yields a normalized function h(x,y)=
The procedure just discussed grows regions from a set of seed points. An ƒ(x,y)/g(x,y)= r(x,y)/k.
alternative is to subdivide an image initially into a set of arbitrary, disjoint regions and
 Thus, if r(x,y) can be segmented by using a single threshold T, then h(x,y) can be
then merge and/or split the regions in an attempt to satisfy the conditions.
segmented by using single threshold of value T/k.
R1 R2

R3 [Link] Back propogation neural networks in detail


R41 R42
R43 R44
1. Split into four disjoint quadrants any region Ri for which P(Ri)=FALSE.
Fig: Back propagation neural network
2. Merge any adjacent regions Rj and Rk for which P(RjURk)=TRUE.
3. Stop when no further merging or splitting is possible.
Mean and standard deviation of pixels in a region to quantify the texture of region.
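The splitting step above can be sketched recursively for a square, power-of-two image, using max − min ≤ tol as an assumed homogeneity predicate P(Ri); the merge step is omitted for brevity.

```python
import numpy as np

def split_regions(img, tol=10, min_size=1):
    """Recursively split into four quadrants any region for which the
    homogeneity predicate P (max - min <= tol) is FALSE.  Returns a list
    of (row, col, size) homogeneous square blocks (splitting phase only)."""
    blocks = []

    def split(r, c, size):
        block = img[r:r + size, c:c + size]
        if size <= min_size or block.max() - block.min() <= tol:
            blocks.append((r, c, size))
        else:
            half = size // 2
            for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
                split(r + dr, c + dc, half)

    split(0, 0, img.shape[0])
    return blocks
```

A uniform image stays one block; an image with one bright quadrant splits exactly once into four homogeneous quadrants.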
PART-C Multilayer Back Propagation Network:
Role of thresholding:
Figure shows the Back propagation neural network .The first set of neurons
[Link] region oriented segmentation in detail serve only as distribution points. They perform no input summation. The input signal is
simply passed on to the weights on their outputs. Each neuron in subsequent layers
produces NET and OUT signals as described above. A neuron is associated with the set Because the petals of all flowers vary in width and length to some degree the pattern 3. What are two main types of Data compression? 11. Define encoder
of weights that connect to its input. The input distribution layer is designated as layer vectors describing the three flowers will also vary, not only between different classes, but
0. Backpropagation can be applied to networks with any number of layers. However only also within a class.
two layers of weights are needed to demonstrate the algorithm. The result of this classic feature selection problem shows that the degree of class files, where exact replication of the original is essential. There are two components
Steps involved in backpropagation training: seperability depends strongly on the choice of pattern measurements selected for an • Lossy compression will result in a certain loss of accuracy in exchange for a A) Source Encoder
 Select the next training pair from the training set. Apply input vector to the network application. substantial increase in compression. Lossy compression is more effective when B) Channel Encoder
input. used to compress graphic images and digitised voice where losses outside visual
 Calculate the output of the network. or aural perception can be tolerated. 12. Define source encoder
 Calculate the error between the output network and the desired output. 4. What is the need for Compression? Source encoder performs three operations
 Adjust the weight of the network in a way that minimizes the error. In terms of storage, the capacity of a storage device can be effectively increased with 1) Mapper -this transforms the input data into non-visual format. It reduces the
 Repeat steps 1 through 4 for each vector in the training set until the error for the entire set is methods that compress a body of data on its way to a storage device and decompresses
acceptably low. it when it is retrieved.
Forward pass: In terms of communications, the bandwidth of a digital communication link can be step is omitted if the system is error free.
Step 1 and 2 can be expressed in vector form as follows: an input vector x is effectively increased by compressing data at the sending end and decompressing data at 3) Symbol encoder- This reduces the coding redundancy .This is the final stage of
applied and the output vector y is produced. Calculations in multilayer network are done the receiving end. encoding process.
layer by layer starting at the layer nearest to the inputs. The NET value of each neuron in At any given time, the ability of the Internet to transfer data is fixed. Thus, if data can
the first layer is calculated as the weight sum of its neurons input. The activation function effectively be compressed wherever possible, significant improvements of data 13. Define channel encoder
F then ‘squashes’ NET to produce the OUT value for each neuron in that layer. Once a throughput can be achieved. Many files can be combined into one compressed document The channel encoder reduces reduces the impact of the channel noise by inserting
set a output for a layer is found, it serves as input to the next layer. This process is making sending easier. redundant bits into the source encoded data.
repeated layer by layer until the final set of output network is produced. 5. What are different Compression Methods? Eg: Hamming code
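The NET/OUT computation and the layer-by-layer forward pass described above can be sketched as follows (a sigmoid is assumed for the squashing function F; all names are illustrative):

```python
import math

def sigmoid(net):
    """Squashing activation F(NET) = 1 / (1 + e^-NET)."""
    return 1.0 / (1.0 + math.exp(-net))

def neuron_out(inputs, weights):
    """NET = x1*w1 + ... + xn*wn, then OUT = F(NET)."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(net)

def forward_pass(x, layers):
    """Propagate layer by layer; `layers` is a list of weight matrices,
    one row of weights per neuron in that layer."""
    for weight_matrix in layers:
        x = [neuron_out(x, w) for w in weight_matrix]
    return x
```

Each layer's OUT values serve as the next layer's inputs, exactly as in the forward pass described above.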
Reverse pass: Run Length Encoding (RLE)
Because a target value is available for each neuron in the output layer, Arithmetic coding 14. What are the types of decoder?
adjusting the associated weights is easily accomplished as a modification of the delta rule. Huffman coding and Source decoder - has two components
Interior layers are referred to as hidden layers as their outputs have no target values for Transform coding a) Symbol decoder - This performs the inverse operation of the symbol encoder.
comparison. Hence the training is more complicated. Hidden layers have no target 6. Define coding redundancy. b) Inverse mapping - This performs the inverse operation of the mapper.
vectors, so the ordinary training process described above cannot be used. If the gray level of an image is coded in a way that uses more code words than Channel decoder - this is omitted if the system is error free.
necessary to represent each gray level, then the resulting image is said to contain coding
redundancy. 15. What are the operations performed by error free compression?
[Link] in detail on pattern and pattern classes. 1) Devising an alternative representation of the image in which its interpixel
7. Define interpixel redundancy? redundant are reduced.
A pattern is a quantitative or structural description of an objective or some other entity The value of any given pixel can be predicted from the values of its neighbors. 2) Coding the representation to eliminate coding redundancy
of interest in an image, The information carried by is small. Therefore the visual contribution of a single pixel to
an image is redundant. Otherwise called as spatial redundant geometric redundant or 16. What is Variable Length Coding?
A pattern class is a family of patterns that share some common properties .Pattern interpixel redundant. Variable Length Coding is the simplest approach to error free compression. It
classes are denoted w1,w2,----wm, where M is the number of classes Eg: Run length coding reduces only the coding redundancy. It assigns the shortest possible codeword to the most
Three principle pattern arrangements used in practice are vectors(for quantitative probable gray levels.
descriptors ) and strings and trees (for structural descriptions) . 8. What is run length coding?
Pattern vectors are represented by bold lower case letters such as x, y, and z, where UNIT V Run-length Encoding, or RLE is a technique used to reduce the size of a repeating 17. Define Huffman coding
each component xi represents the ith [Link] vectors are represented in 1. What is image compression? symbols into two bytes, a count and a symbol. RLE can compress any type of data • When coding the symbols of an information source the Huffman code
columns (i.e. n x 1 matrices) or in the equivalent form x = (x1, x2, …, xn)T, T - transpose. Image compression refers to the process of reducing the amount of data required to regardless of its information content, but the content of data to be compressed affects the yields the smallest possible number of code words, code symbols per
The nature of the pattern vector depends on the measurement technique used to describe Image compression refers to the process of redundancy amount of data required to regardless of its information content, but the content of data to be compressed affects the yields the smallest possible number of code words, code symbols per
the physical pattern itself. represent the given quantity of information for digital image. The basis of reduction compression ratio. Compression is normally measured with the compression ratio: source symbol.
Ex. If we want to describe the three types of iris flowers (iris setosa, virginica, and Compression Ratio = original size / compressed size : 1 Each source symbol is mapped into a fixed sequence of code symbols or code
versicolor) by measuring the width and length of the [Link] is represented in the vector 2. What is Data Compression? 10. Define psycho visual redundancy? words. So it is called a block code.
form x=[x1,x2]T; x1, x2 correspond to width and length. [Link] pattern classes are Data compression requires the identification and extraction of source redundancy. In In normal visual processing certain information has less importance than other 19. Define instantaneous code
w1,w2,w3 corresponding to the three verities. Data compression requires the identification and extraction of source redundancy. In In normal visual processing certain information has less importance than other
other words, data compression seeks to reduce the number of bits used to store or information. So this information is said to be psycho visual redundant. 19. Define instantaneous code
transmit information.
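Run-length encoding as described above — a run mapped to a (count, symbol) pair — can be sketched as (helper names are assumptions):

```python
def rle_encode(data):
    """Encode a sequence as a list of (count, symbol) pairs."""
    if not data:
        return []
    runs = []
    count, current = 1, data[0]
    for sym in data[1:]:
        if sym == current:
            count += 1
        else:
            runs.append((count, current))
            count, current = 1, sym
    runs.append((count, current))
    return runs

def rle_decode(runs):
    """Inverse operation: expand (count, symbol) pairs back to a sequence."""
    return [sym for count, sym in runs for _ in range(count)]
```

As the notes point out, RLE only pays off when runs are long; data with no repetition can actually grow after encoding.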

A code word that is not a prefix of any other code word is called instantaneous or The Major Steps in JPEG Coding involve: 31. Define I-frame
prefix codeword.  DCT (Discrete Cosine Transformation) I-frame is Intraframe or Independent frame. An I-frame is compressed 1. Coding Redundancy :
 Quantization independently of all frames. It resembles a JPEG encoded image. It is the reference point We developed this technique for image enhancement by histogram processing on the
 Zigzag Scan for the motion estimation needed to generate subsequent P and P-frame. assumption that the grey levels of an image are random quantities. Here the grey level
20. Define arithmetic coding  DPCM on DC component histogram of the image also can provide a great deal of insight in the construction of
In arithmetic coding a one-to-one correspondence between source symbols and code  DPCM on DC component 32. Define P-frame
words does not exist; instead a single arithmetic code word is assigned to a sequence of  RLE on AC Components P-frame is called predictive frame. A P-frame is the compressed difference
 Entropy Coding
source symbols. A code word defines an interval of number between 0 and 1. between the current frame and a prediction of it based on the previous I or P-frame Inorder to reduce the interpixel redundancy in an image, the 2-D pixel array normally
27. What is MPEG?
used for human viewing and interpretation must be transformed in to more efficient form.
The acronym is expanded as "Moving Picture Expert Group". It is an international
21. What is bit plane Decomposition? 33. Define B-frame
standard in 1992. It perfectly Works with video and also used in teleconferencing
An effective technique for reducing an image’s interpixel redundancies is to B-frame is the bidirectional frame. A B-frame is the compressed difference 3. Psychovisual Redundancy:
28. Draw the JPEG Encoder.
process the image’s bit plane individually. This technique is based on the concept of between the current frame and a prediction of it based on the previous I or P-frame or Certain information simply has less relative importance than other information in the
decomposing multilevel images into a series of binary images and compressing each next P-frame. Accordingly the decoder must have access to both past and future reference normal visual processing. This information is called Psycovisual Redundant.
binary image via one of several well-known binary compression methods. frames.
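Bit-plane decomposition of an 8-bit image into eight binary images, and its inverse, can be sketched with NumPy (function names are assumptions):

```python
import numpy as np

def bit_planes(img, nbits=8):
    """Decompose a multilevel image into its binary bit planes,
    least significant plane first."""
    return [((img >> b) & 1).astype(np.uint8) for b in range(nbits)]

def recombine(planes):
    """Reassemble the original image from its bit planes."""
    img = np.zeros(planes[0].shape, dtype=np.int64)
    for b, plane in enumerate(planes):
        img |= plane.astype(np.int64) << b
    return img
```

Each plane is a binary image, so any binary compression method (e.g. run-length coding) can be applied to it individually, which is the point of the technique.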
2. Explain the Coding phase in JPEG
22. Draw the block diagram of transform coding system
In this approach the labels for the DC and AC coefficients are coded differently using
Symbol Compressed Huffman codes. The DC coefficient values partitioned into categories. The categories are
Input image Wavelet transform Quantizer PART B
encoder image then Huffman coded. The AC coefficients are generated in a slightly different manner. There
are two special codes: End-of-block (EOB) and ZRL

Table: Coding of the differences of the DC labels


29. Draw the JPEG Decoder. 1. Define Compression and explain data Redundancy in image
compression 0 0
Symbol Inverse wavelet Decompressed 1 -1 1
Compressed image decoder transform image Compression: It is the process of reducing the size of the given data or an image. It will 2 -3 -2 2 3
help us to reduce the storage space required to store an image or File. 3 -7 ……………………. -4 4 ……… 7
……………………………………….
Data Redundancy: Table: sample table for obtaining the Huffman code for a given label value and run length
Data or words that either provide no relevant information or simply Z/C Codeword Z/C Codeword ….. Z/C Codeword
23. How can the effectiveness of quantization be improved? restate what is already known are said to be data redundancy.
• Introducing an enlarged quantization interval around zero, called a dead 30. What is zig zag sequence?
The purpose of the Zig-zag Scan: 0/0 1010 F/0 11111111001
zero. Consider N1 and N2 number of information carrying units in two data sets 0/1 00 1/1 1100 F/1 111111111111110101
• Adapting the size of the quantization intervals from scale to scale. In  To group low frequency coefficients in top of vector. that represent the same information ……………………
either case, the selected quantization intervals must be transmitted to the  Maps 8 x 8 to a 1 x 64 vector
decoder with the encoded image bit stream. Data Redundancy Rd = 1-1/Cr
To encode the AC coefficient First using Zigzag scan. We obtain
24. What are the coding systems in JPEG? Where Cr is called the Compression Ratio. -9 3 0 0 0 0 0 ……… 0
1. A lossy baseline coding system, which is based on the DCT and is The first value belong to category 1. transmit the code corresponding to 0/1 follow by a
adequate for most compression applications. Cr = N1/N2. single bit 1 to indicate that the value being transmitted is 1 and not -1. Similarly other
2. An extended coding system for greater compression, higher AC coefficient codes are transmitted.
precision or progressive reconstruction applications. Types of Redundancy To obtain the reconstruction of the original block Dequantization is performed and taking
3. a lossless independent coding system for reversible compression. inverse transform of the coefficient we get the reconstructed block
25. What is JPEG? There are three basic Redundancy and they are classified as
The acronym is expanded as "Joint Photographic Expert Group". It is an international 1) Coding Redundancy [Link] Huffman coding with an example.
standard in 1992. It perfectly Works with color and grayscale images, Many applications 2) Interpixel Redundancy
e.g., satellite, medical,... 3) Psychovisual Redundancy. • This technique was developed by David Huffman.
26. What are the basic steps in JPEG?

• The codes generated using this technique or procedure are called Huffman codes. Efficiency η = entropy / average length = 0.284 EZW coder was introduced by Shapiro. It is a quantization and coding strategy that
• These codes are prefix codes and are optimum for a given model. =0.284% characteristics of the wavelet [Link] particular characteristic used by the
EZW algorithm is that there are wavelet coefficients in different subbands that represent
The Huffman procedure is based on two observations regarding Find redundancy the same spatial location in the image.
optimum prefix codes Source In 10-band decomposition ,the coefficient a in the upper-left corner of band I
[Link] an optimum code, symbols that occur more frequently will have Redundancy =Re =1-η = 0.716 Source Channel Channel Channel
Decode represents the same spatial location as coefficient a1 represents the same spatial location
Encoder Encoder Decoder
shorter code words than symbols that occur less frequently. r as coefficients a11,a12,a13,a14 in band V. Each of these pixel represents the same spatial
[Link] an optimum code ,the two symbols that occur least frequently will location as four pixels in band VIII.
have the same length PART -C

Design of a Huffman Code 1) Define Compression and Explain the general compression system model The Source Encoder Will removes the input redundancies. The channel a
Compression: It is the process of reducing the size of the given data or an image. It will encoder will increase the noise immunity of the source encoder’s output. If the channel II
To design a Huffman code ,we first sort the letters in descending probability help us to reduce the storage space required to store an image or File. between encoder and decoder is noise free then the channel encoder and decoder can be V
omitted.
Image Compression Model: a3 a4
Find the Huffman code for the following:
P(A)=0.2, p(B)=0.1, p(C)=0.2,p(D)=0.05,p(E)=0.3,p(F)=0.05,p(G)=0.1
There are two Structural model and they are broadly Classified as follows
1. An Encoder VI VII
2. A Decoder. Quantize I
Mapper Symbol
r Encoder

Encoder Channel
Decoder

MAPPER:
It transforms the input data in to a format designed to reduce the interpixel redundancy in
the input image.
An Input image f(x,y) is fed in to encoder and create a set of symbols and after
transmission over the channel ,the encoded representation is fed in to the decoder. QUANTIZER:
It reduce the accuracy of the mapper’s output.

A General Compression system model: SYMBOL ENCODER: We can visualize the relationships of these coefficients in form of tree:The
It creates a fixed or variable length code to represent the quantizer’s output coefficient a forms the root of the tree with three descendants a1,a2,a3.
The General system model consist of the following components,They are and maps the output in accordance with the code. EZW algorithm is a multiple pass algorithm,with each pass consisting of two
broadly classified as steps.
Find the average length and entropy
1. Source Encoder [Link] map encoding or the dominant pass
2. Channel Encoder [Link] or the subordinate pass
Average length =I= k =1 ∑ M
p(ak)l(ak) 3. Channel If Cmax is the value of the largest coefficient,the initial value of the threshold T0 is
I=3(0.2)+3(0.1)+3(0.2)+5(0.05)+1(0.3)+5(0.05)+4(0.1) 4. Channel Decoder Symbol Inverse given by
=2.7 bits/symbol 5. Souce Decoder decoder mapper [Log Cmax]
T0=2
∑ M SYMBOL DECODER:
Entropy =H(ak)= - k =1 p(ak)log2p(ak) The inverse operation of the source encoder’s symbol will be performed This selection guarantees that the largest coefficient will lie in the interval[T0,2T0].In
=0.7667 and maps the blocks. each pass,the threshold Ti is reduced to half the value it had in the previous pass:

Ti=1/2(Ti-1)
Find Efficiency [Link] the concepts of Embedded Zero Tree coding
For given value of Ti,we assign one of four possible labels to the coefficients:

-Simple
[Link] positive(sp) -Main Use:
[Link] negative(sn) -Snr-scalable MPEG7: One application for which vector quantizer has been extremely popular is image
[Link] root(zr) -Spatially scalable -Focus on the development of a multimedia content description interface seems to be compression.
[Link] zero(iz) -High somewhat removed from the study of data compression
-Simple profile uses the [Link] removal of the Bframes makes the requirements -These activities relate to the core principles of data compression which is the Disadvantage of vector quantization:
The coefficients labeled significant are simply those that fall in the outer levels of the simpler. development of compact descriptions of information Vector quantization applications operate at low rates. For applications such as
quantized and are assigned an initial reconstructed value of 1.5Ti or -1.5Ti,depending on MPEG 4 high-quality video coding, which requires higher rates this is definitely a problem.
whether the coefficient is positive or negative. -Provides a more abstract approach to the coding of [Link] standard views the 4. Discuss about Vector quantization procedure in detail To solve these problems, there are several approaches which entails some
multimedia scene as a collection of [Link] objects can be coded independently. structure in the quantization process
[Link] MPEGcompression standard -Language called the binary format for scenes based on the virtual reality modeling Source Encoder Decoder Reconstruction
language has been developed by Mpeg. output Tree structures vector quantizers:
Introduction: -The protocol for managing the elementary streams and their multiplexed version called Group
Find closest Table Unblock This structure organizes codebook in such a way that it is easy to pick which part
-The basic structure of the compression algorithm proposed by mpeg is very similar to the delivery multimedia integration framework is a part of Mpeg4 code-vector lookup contains the desired output vector
into
that of ITU-T H.261 -The different objects that makeup the scene are coded and sent to the multiplexer vectors
-In mpeg the blocks are organized in macro blocks which are defined in the same manner -The information about the presence of these objects is also provided to the motion Structured vector quantizers:
as that of H.261 algorithm compensator predictor Tree-structured vector quantizer solves the complexity problem, but acerbates the
-The mpeg standard initially had applications that require digital storage and retrieval as a -It is also used in facial animation controlled by facial definition parameter storage problem
major focus -It allows for object scalability. We now take entirely different tacks and develop vector quantize that do not have
these storage problems; however we pay for this relief in other ways
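As an illustration of the encoder's closest-code-vector search and the decoder's table lookup described in this question, here is a minimal sketch; the 2-D codebook and source vectors below are hypothetical examples, not taken from the text:

```python
import math

def vq_encode(vectors, codebook):
    """Map each source vector to the index of its nearest code-vector
    (minimum Euclidean distance), as the VQ encoder does."""
    indices = []
    for v in vectors:
        dists = [math.dist(v, c) for c in codebook]
        indices.append(dists.index(min(dists)))
    return indices

def vq_decode(indices, codebook):
    """The decoder is just a table lookup into the same codebook."""
    return [codebook[i] for i in indices]

# Hypothetical 2-D codebook with four code-vectors (each index needs 2 bits)
codebook = [(0.0, 0.0), (0.0, 4.0), (4.0, 0.0), (4.0, 4.0)]
source = [(0.2, 0.3), (3.8, 4.1), (4.2, 0.1)]
idx = vq_encode(source, codebook)
print(idx)                        # -> [0, 3, 2]
print(vq_decode(idx, codebook))   # the reconstruction is the chosen code-vectors
```

Note how all the computational effort (the distance search) sits in the encoder, while decoding is a single lookup, which matches the asymmetry discussed in the answer.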
In vector quantization we group the source output into blocks or vectors. This vector of source outputs forms the input to the vector quantizer. At both the encoder and the decoder of the vector quantizer, we have a set of L-dimensional vectors called the codebook of the vector quantizer. The vectors in this codebook are known as code-vectors. Each code-vector is assigned a binary index.

At the encoder, the input vector is compared to each code-vector in order to find the code-vector closest to the input vector. In order to inform the decoder which code-vector was found to be the closest, we transmit or store the binary index of that code-vector. Because the decoder has exactly the same codebook, it can retrieve the code-vector.

Although the encoder has to perform a considerable amount of computation in order to find the closest reproduction vector to the vector of source outputs, the decoding consists of a table lookup. This makes vector quantization a very attractive encoding scheme for applications in which the resources available for decoding are considerably less than the resources available for encoding.

Advantages of vector quantization over scalar quantization:
- For a given rate (in bits per sample), use of vector quantization results in lower distortion than when scalar quantization is used at the same rate.
- Vectors of source output values tend to fall in clusters. By selecting the quantizer output points to lie in these clusters, we have a more accurate representation of the source output.

Frames

I-frames
- MPEG periodically includes some frames that are coded without any reference to the past frames. These frames are called I-frames.
- I-frames do not use temporal correlation for prediction. The number of frames between two consecutive I-frames is a trade-off between compression efficiency and convenience.

P and B frames
- In order to improve the compression efficiency, the MPEG-1 algorithm contains two other types of frames: the predictive coded (P) frames and the bidirectionally predictive coded (B) frames.
- Generally, the compression efficiency of P-frames is substantially higher than that of I-frames.

Anchor frames
- The I and P frames are sometimes called anchor frames.
- To compensate for the reduction in the amount of compression due to the frequent use of I-frames, the MPEG standard introduced B-frames.

Group of pictures (GOP)
- The GOP is a small random access unit in the video sequence.
- The GOP structure is set up as a trade-off between the high compression efficiency of motion-compensated coding and the fast picture acquisition capability of periodic intra-only processing.
- The format for MPEG is very flexible; however, the MPEG committee has provided some suggested values for the various parameters.
- For MPEG-1 these suggested values are called the constrained parameter bitstream.

Figure: MPEG encoder block diagram (motion estimation, predictors, frame store, DCT, Q, Q-1, inverse DCT, multiplex) and MPEG-4 coder (shape coding, motion estimation, texture coding, multiplex).

MPEG-2
- MPEG-2 takes a toolkit approach, providing a number of subsets, each containing different options.
- For a particular application the user can select from a set of profiles and levels (the profile types are listed above).

5. Explain arithmetic coding with an example.

Algorithm implementation
As the sequence x1 x2 ... is being encoded, the interval endpoints are updated as
ln = ln-1 + (un-1 - ln-1) Fx(xn - 1)
un = ln-1 + (un-1 - ln-1) Fx(xn)
As n becomes larger, these values get closer and closer together; the intervals become narrower. There are three possibilities:
1. the interval is entirely confined to the lower half of the unit interval [0, 0.5)
2. the interval is entirely confined to the upper half of the unit interval [0.5, 1)
3. the interval straddles the midpoint of the unit interval
To keep the subinterval (tag) mapped onto the full [0, 1) interval, we rescale with
E1: [0, 0.5) -> [0, 1), E1(x) = 2x
E2: [0.5, 1) -> [0, 1), E2(x) = 2(x - 0.5)
This process of generating the bits of the tag without waiting to see the entire sequence is called incremental encoding.

Tag generation with scaling
Eg: A = {a1, a2, a3}, P(a1)=0.8, P(a2)=0.02, P(a3)=0.18, so Fx(1)=0.8, Fx(2)=0.82, Fx(3)=1.
Encode the sequence 1 3 2 1.

Solution:
• First element is 1
Initialize u0 = 1, l0 = 0
l1 = 0 + (1-0) Fx(0) = 0
u1 = 0 + (1-0) Fx(1) = 0.8
The interval [0, 0.8) is confined to neither the lower nor the upper half of the unit interval, so we proceed.
• Second element is 3
l2 = 0 + (0.8-0) Fx(2) = 0.656
u2 = 0 + (0.8-0) Fx(3) = 0.8
The interval [0.656, 0.8) is in the upper half. Send the binary code 1 and scale:
l2 = 2(0.656-0.5) = 0.312
u2 = 2(0.8-0.5) = 0.6

• Third element is 2
l3 = 0.312 + (0.6-0.312)(0.8) = 0.5424
u3 = 0.312 + (0.6-0.312)(0.82) = 0.54816
The interval [0.5424, 0.54816) is in the upper half. Send the binary code 1 and scale:
l3 = 2(0.5424-0.5) = 0.0848
u3 = 2(0.54816-0.5) = 0.09632
The interval [0.0848, 0.09632) is in the lower half. Send the binary code 0 and scale:
l3 = 2(0.0848) = 0.1696
u3 = 2(0.09632) = 0.19264
The interval [0.1696, 0.19264) is in the lower half. Send the binary code 0 and scale:
l3 = 2(0.1696) = 0.3392
u3 = 2(0.19264) = 0.38528
The interval [0.3392, 0.38528) is in the lower half. Send the binary code 0 and scale:
l3 = 2(0.3392) = 0.6784
u3 = 2(0.38528) = 0.77056
The interval [0.6784, 0.77056) is in the upper half. Send the binary code 1 and scale:
l3 = 2(0.6784-0.5) = 0.3568
u3 = 2(0.77056-0.5) = 0.54112
The interval [0.3568, 0.54112) is confined to neither half of the unit interval, so we proceed.

• Fourth element is 1
l4 = 0.3568 + (0.54112-0.3568)(0) = 0.3568
u4 = 0.3568 + (0.54112-0.3568)(0.8) = 0.504256

Stop the encoding. The binary sequence generated is 110001. Transmit 1 followed by as many 0s as required by the word length.

6. Explain the JPEG compression scheme in detail.

JPEG is a transform coding approach using the DCT. Consider the 8*8 block of an image shown below.

Table: an 8*8 block of an image
124 125 122 120 122 119 117 118
121 121 120 119 119 120 120 118
125 124 123 122 121 121 120 120
124 124 125 125 126 125 124 124
127 127 128 129 130 128 127 125
143 142 143 142 140 139 139 139
150 148 152 152 152 152 150 151
156 159 158 155 158 158 157 156

The Transform
The transform used in the JPEG scheme is the DCT. The input image is first "level shifted" by 2^(p-1), i.e., 2^(p-1) is subtracted from each pixel value, where p is the number of bits per pixel. Then the image is divided into blocks of size 8*8, which are transformed using an 8*8 forward DCT. The table below shows the DCT coefficients.

Table: The DCT coefficients
39.88 6.56 -2.24 1.22 -0.37 -1.08 0.79 1.13
-102.43 4.56 2.26 1.12 0.35 -0.63 -1.05 -0.48
37.77 1.31 1.77 0.25 -1.50 -2.21 -0.10 0.23
-5.67 2.24 -1.32 -0.81 1.41 0.22 -0.13 0.17
-3.37 -0.74 -1.75 0.77 -0.62 -2.65 -1.30 0.76
5.98 -0.13 -0.45 -0.77 1.99 -0.26 1.46 0.00
3.97 5.52 2.39 -0.55 -0.05 -0.84 -0.52 -0.13
-3.43 0.51 -1.07 0.87 0.96 0.09 0.33 0.01

Quantization
The JPEG algorithm uses uniform midtread quantization to quantize the various coefficients. The quantizer step sizes are organized in a table called the quantization table, shown below.

Table: Sample quantization table
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99

The label corresponding to the quantized value of the transform coefficient θij is obtained as
lij = floor(θij/Qij + 0.5)
where Qij is the (i,j)th element of the quantization table. The reconstructed value is obtained by multiplying the label with the corresponding entry in the quantization table.

Table: The quantizer labels
2 1 0 0 0 0 0 0
-9 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Coding
In this approach the labels for the DC and AC coefficients are coded differently using Huffman codes. The differences of the DC labels are partitioned into categories, and the categories are then Huffman coded. The codes for the AC coefficients are generated in a slightly different manner. There are two special codes: End-of-block (EOB) and ZRL.

Table: Coding of the differences of the DC labels
Category 0: 0
Category 1: -1, 1
Category 2: -3, -2, 2, 3
Category 3: -7, ..., -4, 4, ..., 7

Table: Sample table for obtaining the Huffman code for a given label value and run length
Z/C Codeword        Z/C Codeword   ...  Z/C Codeword
0/0 1010            -              ...  F/0 11111111001
0/1 00              1/1 1100       ...  F/1 111111111111110101
...

To encode the AC coefficients, a zigzag scan is used first. We obtain
1 -9 3 0 0 0 0 ... 0
The first value belongs to category 1; transmit the code corresponding to 0/1, followed by a single bit 1 to indicate that the value being transmitted is 1 and not -1. Similarly, the other AC coefficient codes are transmitted.
To obtain the reconstruction of the original block, dequantization is performed, and by taking the inverse transform of the coefficients we get the reconstructed block.

7. Describe the concepts of run-length coding

• The model that gives rise to run-length coding is the Capon model [40], a two-state Markov model with states Sw and Sb.
• The transition probabilities P(w|b) and P(b|w), and the probabilities of being in each state, P(Sw) and P(Sb), completely specify this model.
• For facsimile images, P(w|w) and P(w|b) are generally significantly higher than P(b|w) and P(b|b).
• The Markov model is represented by a state diagram.
• The entropy using a probability model and the iid assumption was significantly more than the entropy using the Markov model.
• Let us try to interpret what the model says about the structure of the data.
• The highly skewed nature of the probabilities P(b|w) and P(w|w), and to a lesser extent P(w|b) and P(b|b), says that once a pixel takes on a particular color, it is highly likely that the following pixels will also be of the same color.
• So, rather than code the color of each pixel separately, we can simply code the length of the runs of each color.
• For example, if we had 190 white pixels followed by 30 black pixels, followed by another 210 white pixels, instead of coding the 430 pixels individually we would code the sequence 190, 30, 210, along with an indication of the color of the first string of pixels.
• Coding the lengths of runs instead of coding individual values is called run-length coding.

CCITT GROUP 3 AND 4 - RECOMMENDATIONS T.4 AND T.6
• The one-dimensional coding scheme is a run-length coding scheme in which each line is represented as a series of alternating white runs and black runs. The first run is always a white run. If the first pixel is a black pixel, then we assume that we have a white run of length zero.
• Runs of different lengths occur with different probabilities; therefore, they are coded using a variable-length code.
• The number of possible lengths of runs is extremely large, and it is simply not feasible to build a codebook that large.
• Therefore, instead of generating a Huffman code for each run length rl, the run length is expressed in the form
rl = 64*m + t, for t = 0, 1, ..., 63 and m = 1, 2, ..., 27
• When we have to represent a run length rl, instead of finding a code for rl we use the corresponding codes for m and t.
• The codes for t are called the terminating codes, and the codes for m are called make-up codes.
• Except for the optional codes, there are separate codes for black and white run lengths.
• This coding scheme is generally referred to as a modified Huffman scheme.
• In the two-dimensional scheme, instead of reporting the run lengths, which in terms of our Markov model are the lengths of time we remain in one state, we report the transition times when we move from one state to another.
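The run extraction and the 64*m + t decomposition described for the one-dimensional scheme can be sketched as follows; the helper names are hypothetical, and the code only illustrates the decomposition, not the actual T.4 code tables:

```python
def t4_split(run_length):
    """Express a run length as 64*m + t (0 <= t <= 63), so that one
    make-up code (for 64*m) and one terminating code (for t) suffice."""
    m, t = divmod(run_length, 64)
    return m, t

def run_lengths(line):
    """Turn a line of pixels (0 = white, 1 = black) into alternating
    run lengths, starting with a white run (possibly of length zero)."""
    runs, color, count = [], 0, 0
    for p in line:
        if p == color:
            count += 1
        else:
            runs.append(count)
            color, count = p, 1
    runs.append(count)
    return runs

line = [0] * 190 + [1] * 30 + [0] * 210
print(run_lengths(line))   # -> [190, 30, 210]
print(t4_split(190))       # -> (2, 62), i.e. 190 = 64*2 + 62
```

A line beginning with a black pixel correctly yields a leading white run of length zero, as the recommendation requires.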
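The tag-generation walkthrough in question 5 (sequence 1 3 2 1 under P(a1)=0.8, P(a2)=0.02, P(a3)=0.18) can be cross-checked with a short sketch of incremental encoding with E1/E2 rescaling; the function name is hypothetical:

```python
def arithmetic_encode(seq, cdf):
    """Incremental arithmetic encoding with E1/E2 rescaling.
    cdf[k] = Fx(k); symbols are 1-based as in the worked example."""
    low, high = 0.0, 1.0
    bits = ""
    for s in seq:
        width = high - low
        high = low + width * cdf[s]       # un = ln-1 + (un-1 - ln-1) Fx(xn)
        low = low + width * cdf[s - 1]    # ln = ln-1 + (un-1 - ln-1) Fx(xn - 1)
        # Rescale while the interval is confined to one half of [0, 1)
        while True:
            if high <= 0.5:               # E1: lower half -> emit 0
                bits += "0"
                low, high = 2 * low, 2 * high
            elif low >= 0.5:              # E2: upper half -> emit 1
                bits += "1"
                low, high = 2 * (low - 0.5), 2 * (high - 0.5)
            else:
                break
    return bits, (low, high)

# Model from the example: Fx(0)=0, Fx(1)=0.8, Fx(2)=0.82, Fx(3)=1
cdf = [0.0, 0.8, 0.82, 1.0]
bits, interval = arithmetic_encode([1, 3, 2, 1], cdf)
print(bits)  # -> 110001
```

The emitted bit string matches the 110001 obtained by hand, and the final interval matches [0.3568, 0.504256).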
