Questions and Problems with Solutions - Part 1
Discrete Fourier Transform
Questions
1. Explain why it is common to work with the transform of an image instead of the image
itself.
Very often, observing the transform of an image provides more insight into the
properties and characteristics of the image than observing the pixel intensities directly.
Furthermore, many operations are easy to carry out on the transform of an image. Very
often the large and, therefore, important transform coefficients are concentrated close
to the origin of the transform domain. It is quite common to keep only a subset of the
transform coefficients, i.e., the ones close to the origin, and discard the rest. By doing
so, we can easily achieve image compression. We can also achieve various filtering
operations, e.g., low pass, band pass or high pass filtering, by selecting and keeping
certain subsets of the transform coefficients.
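The idea of keeping only the coefficients close to the origin can be sketched in one
dimension. The following is a minimal illustration, not part of the original solution: the
signal `row` (standing in for one image row), the cutoff of 2 and the helper names are all
assumptions made for the example, and the DFT uses a 1/N normalization on the forward
transform.

```python
import cmath
import math

def dft(signal):
    """Forward 1-D DFT with 1/N normalization."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n)) / n
            for u in range(n)]

def idft(coeffs):
    """Inverse 1-D DFT matching dft() above (no extra scaling)."""
    n = len(coeffs)
    return [sum(coeffs[u] * cmath.exp(2j * math.pi * u * x / n) for u in range(n))
            for x in range(n)]

def keep_near_origin(coeffs, cutoff):
    """Zero every coefficient whose frequency index is more than `cutoff` from the origin."""
    n = len(coeffs)
    return [c if min(u, n - u) <= cutoff else 0 for u, c in enumerate(coeffs)]

# Hypothetical smooth row: a large slow variation plus a small fast ripple.
n = 32
cutoff = 2
row = [10 + 5 * math.cos(2 * math.pi * x / n) + 0.2 * math.cos(2 * math.pi * 9 * x / n)
       for x in range(n)]

coeffs = dft(row)
kept = keep_near_origin(coeffs, cutoff)
approx = [value.real for value in idft(kept)]

num_kept = sum(1 for u in range(n) if min(u, n - u) <= cutoff)
max_error = max(abs(s - a) for s, a in zip(row, approx))
print(num_kept, "of", n, "coefficients kept, max error", round(max_error, 6))
```

Here 5 of the 32 coefficients are retained, and the only thing lost is the small ripple of
amplitude 0.2, which is why discarding coefficients far from the origin acts both as
compression and as low pass filtering.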
2. Explain why the Fourier transform amplitude of an image alone often does not capture the
intelligibility of the image.
In viewing a picture, some of the most important visual information is contained in the
edges and regions of high contrast. Intuitively, regions of maximum and minimum
intensity in a picture are places at which complex exponentials at different frequencies
are in phase. Therefore, it seems plausible to expect the phase of the Fourier transform
of a picture to contain much of the information of the picture and, in particular, the
phase should capture the information about the edges.
3. Explain why the Fourier transform phase of an image alone often captures most of the
intelligibility of the image.
Same answer as in 2 above.
4. In a specific experiment, it is observed that the amplitude response of an image exhibits
energy concentration along a straight line in the 2-D frequency plane. What are the
implications of this observation as far as the original image is concerned?
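The original image contains straight-line structure (lines or edges) oriented
perpendicular to the line observed in the frequency plane, because a straight line in
space implies a straight line perpendicular to the original one in frequency (see related
problem below).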
5. In a specific experiment, it is observed that the amplitude of the Fourier transform of an
image exhibits high values only very close to the origin and takes very small values within
the rest of the two-dimensional frequency plane. State the implications of this observation as
far as the original image is concerned.
The original image contains mainly low frequency components, i.e., it is quite smooth.
Problems
1. Consider an $M \times N$-pixel image $f(x,y)$ which is zero outside $0 \le x \le M-1$ and
$0 \le y \le N-1$. In transform coding, we discard the transform coefficients with small
magnitudes and code/transmit only those with large magnitudes. Let $F(u,v)$ denote the
$M \times N$-point Discrete Fourier Transform (DFT) of $f(x,y)$. Let $G(u,v)$ denote $F(u,v)$
modified by
$$G(u,v) = \begin{cases} F(u,v), & \text{when } |F(u,v)| \text{ is large} \\ 0, & \text{otherwise} \end{cases}$$
Let $G(u,v)$ be chosen such that
$$\frac{\sum_{u=0}^{M-1}\sum_{v=0}^{N-1} |G(u,v)|^2}{\sum_{u=0}^{M-1}\sum_{v=0}^{N-1} |F(u,v)|^2} = \frac{9}{10}$$
We reconstruct an image $g(x,y)$ by computing the $M \times N$-point inverse DFT of $G(u,v)$.
Express $\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} (f(x,y) - g(x,y))^2$ in terms of
$\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)^2$.
The signal $f(x,y) - g(x,y)$ is obtained by the inverse DFT of the signal
$F(u,v) - G(u,v)$. Therefore, according to Parseval's theorem, the energy of the signal
$f(x,y) - g(x,y)$ is equal, up to the normalization constant of the DFT, to the energy of
the signal $F(u,v) - G(u,v)$. The same constant relates the energies of $f(x,y)$ and
$F(u,v)$, so it cancels when the two are compared. The signal $F(u,v) - G(u,v)$ consists of
the DFT samples of $F(u,v)$ which were excluded in forming $G(u,v)$. Since $G(u,v)$
captures 0.9 of the energy of $F(u,v)$, the signal $F(u,v) - G(u,v)$ will capture 0.1 of
the energy of $F(u,v)$. Therefore,
$$\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} (f(x,y) - g(x,y))^2 = 0.1 \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)^2$$
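The Parseval argument above can be checked numerically. The sketch below is only an
illustration under stated assumptions: the 4 x 4 test image, the choice of keeping the four
largest coefficients (rather than exactly 9/10 of the energy) and all function names are
hypothetical. It confirms that the relative reconstruction error equals one minus the
fraction of transform energy retained.

```python
import cmath
import math

def dft2(image):
    """Forward 2-D DFT with 1/(MN) normalization."""
    M, N = len(image), len(image[0])
    return [[sum(image[x][y] * cmath.exp(-2j * math.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N)) / (M * N)
             for v in range(N)] for u in range(M)]

def idft2(coeffs):
    """Inverse 2-D DFT matching dft2() above (no extra scaling)."""
    M, N = len(coeffs), len(coeffs[0])
    return [[sum(coeffs[u][v] * cmath.exp(2j * math.pi * (u * x / M + v * y / N))
                 for u in range(M) for v in range(N))
             for y in range(N)] for x in range(M)]

def energy(grid):
    """Sum of squared magnitudes of all samples."""
    return sum(abs(value) ** 2 for row in grid for value in row)

# Hypothetical 4 x 4 test image.
f = [[(3 * x + 5 * y + x * y) % 7 + 1 for y in range(4)] for x in range(4)]
F = dft2(f)

# Keep the coefficients at least as large as the fourth-largest magnitude.
magnitudes = sorted((abs(c) for row in F for c in row), reverse=True)
threshold = magnitudes[3]
G = [[c if abs(c) >= threshold else 0 for c in row] for row in F]
g = idft2(G)

retained = energy(G) / energy(F)
difference = [[f[x][y] - g[x][y] for y in range(4)] for x in range(4)]
error_ratio = energy(difference) / energy(f)
print(round(retained, 6), round(error_ratio, 6))
```

With a retained fraction of 0.9, as in the problem, the same identity gives an error ratio
of 0.1.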
2. Consider an $M \times N$-pixel gray level image $f(x,y)$ which is zero outside
$0 \le x \le M-1$ and $0 \le y \le N-1$. The image intensity is given by the following
relationship:
$$f(x,y) = \begin{cases} c, & x = x_0,\ 0 \le y \le N-1 \\ 0, & \text{otherwise} \end{cases}$$
where $c$ is a constant value between 0 and 255 and $x_0$ is a constant value between 0 and
$M-1$.
(i) Plot the image intensity.
[Plot: a single line of intensity c at x = x0, parallel to the y axis of the (x, y) plane.]
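(ii) Find the $M \times N$-point Discrete Fourier Transform (DFT) of $f(x,y)$. Plot its
amplitude response.
$$F(u,v) = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\,e^{-j2\pi(ux/M+vy/N)} = \frac{1}{MN}\sum_{y=0}^{N-1} f(x_0,y)\,e^{-j2\pi(ux_0/M+vy/N)} = \frac{c\,e^{-j2\pi ux_0/M}}{MN}\sum_{y=0}^{N-1} e^{-j2\pi vy/N}$$
For $v \neq 0$
$$F(u,v) = \frac{c\,e^{-j2\pi ux_0/M}}{MN}\sum_{y=0}^{N-1} \left(e^{-j2\pi v/N}\right)^{y} = \frac{c\,e^{-j2\pi ux_0/M}}{MN}\,\frac{1-\left(e^{-j2\pi v/N}\right)^{N}}{1-e^{-j2\pi v/N}} = \frac{c\,e^{-j2\pi ux_0/M}}{MN}\,\frac{1-e^{-j2\pi v}}{1-e^{-j2\pi v/N}}$$
We know that $e^{-j2\pi v} = 1$ for integer $v$, and therefore, for $v \neq 0$ we obtain
$F(u,v) = 0$.
For $v = 0$
$$F(u,0) = \frac{c\,e^{-j2\pi ux_0/M}}{MN}\sum_{y=0}^{N-1} e^{0} = \frac{c\,e^{-j2\pi ux_0/M}}{MN}\,N = \frac{c\,e^{-j2\pi ux_0/M}}{M}$$
Therefore
$$F(u,v) = \begin{cases} \dfrac{c\,e^{-j2\pi ux_0/M}}{M}, & v = 0 \\ 0, & \text{otherwise} \end{cases}$$
and the amplitude response is
$$|F(u,v)| = \begin{cases} \dfrac{c}{M}, & v = 0 \\ 0, & \text{otherwise} \end{cases}$$
[Plot: amplitude response, nonzero only along the line v = 0 of the (u, v) plane.]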
(iii) Compare the plots found in (i) and (ii) above.
As verified, a straight line in space implies a straight line perpendicular to the
original one in frequency.
The following result holds: $\sum_{k=0}^{N-1} a^k = \dfrac{1-a^N}{1-a}$, $a \neq 1$.
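The result of Problem 2 can be verified numerically with a direct evaluation of the DFT
sum. This is a minimal sketch: the sizes M = 8 and N = 4, the intensity c = 200 and the
position x0 = 3 are arbitrary values chosen for the example, and the DFT uses the 1/(MN)
normalization on the forward transform.

```python
import cmath
import math

def dft2(image):
    """Forward 2-D DFT with 1/(MN) normalization."""
    M, N = len(image), len(image[0])
    return [[sum(image[x][y] * cmath.exp(-2j * math.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N)) / (M * N)
             for v in range(N)] for u in range(M)]

# Hypothetical sizes and values: a line of intensity c at x = x0.
M, N = 8, 4
c, x0 = 200, 3
f = [[c if x == x0 else 0 for y in range(N)] for x in range(M)]

F = dft2(f)
amplitude = [[abs(value) for value in row] for row in F]

# Amplitude along v = 0 (expected c / M) and the largest amplitude elsewhere (expected 0).
along_v0 = [round(amplitude[u][0], 9) for u in range(M)]
largest_elsewhere = max(amplitude[u][v] for u in range(M) for v in range(1, N))
print(along_v0, largest_elsewhere)
```

The line x = x0 in space, which runs along y, maps to the line v = 0 in frequency, which
runs along u, i.e. perpendicular to the original one.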
Discrete Hadamard Transform
Questions
Let $f(x,y)$ denote an $M \times N$-point 2-D sequence that is zero outside $0 \le x \le M-1$,
$0 \le y \le N-1$, where $M$ and $N$ are integer powers of 2. In implementing the standard
Hadamard Transform of $f(x,y)$, we relate $f(x,y)$ to a new $M \times N$-point sequence $H(u,v)$.
1. Define the sequence $H(u,v)$ in terms of $f(x,y)$.
The 2-D Hadamard transform is defined as
$$H(u,v) = \frac{1}{N}\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\prod_{i=0}^{n-1}(-1)^{\left(b_i(x)b_i(u)+b_i(y)b_i(v)\right)}$$
or
$$H(u,v) = \frac{1}{N}\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\,(-1)^{\sum_{i=0}^{n-1}\left(b_i(x)b_i(u)+b_i(y)b_i(v)\right)}$$
where $(x)_{10} = (b_{n-1}(x)\,b_{n-2}(x)\cdots b_0(x))_2$, $b_i(x) = 0$ or $1$ for $i = 0,\ldots,n-1$,
$(u)_{10} = (b_{n-1}(u)\,b_{n-2}(u)\cdots b_0(u))_2$ and the same for $(y)_{10}$ and $(v)_{10}$.
2. Define the concept of sequency in Hadamard transform.
Sequency refers to the number of zero crossings or the number of transitions in the
basis vectors which form the Hadamard matrix.
3. Define the Ordered Hadamard Transform without giving any mathematical equations.
The Ordered Hadamard transform differs from the Hadamard transform only in the
order of basis functions. The order of basis functions of the Ordered Hadamard
transform is such that the sequency of the basis functions increases as the index of the
transform increases.
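Sequency and the reordering it induces can be illustrated on the 8 x 8 Hadamard matrix.
The sketch below is an illustration only: it builds the natural-order matrix from the bit
representation of the indices, counts sign changes in each row, and sorts the rows by that
count. The function names are hypothetical.

```python
def hadamard_matrix(n_bits):
    """Natural-order Hadamard matrix of size 2**n_bits with entries +1 and -1.

    Entry (u, x) is (-1) raised to the number of bit positions where both u and x have a 1.
    """
    size = 1 << n_bits
    return [[(-1) ** bin(u & x).count("1") for x in range(size)] for u in range(size)]

def sequency(row):
    """Number of sign changes between neighbouring entries of a basis vector."""
    return sum(1 for left, right in zip(row, row[1:]) if left != right)

H = hadamard_matrix(3)

# Sequency of each row in the standard (natural) order.
natural_sequencies = [sequency(row) for row in H]

# Ordered Hadamard matrix: the same rows, sorted so that sequency increases with the index.
ordered = sorted(H, key=sequency)
ordered_sequencies = [sequency(row) for row in ordered]

print(natural_sequencies)   # [0, 7, 3, 4, 1, 6, 2, 5]
print(ordered_sequencies)   # [0, 1, 2, 3, 4, 5, 6, 7]
```

In the natural order the sequency jumps around with the index, whereas in the ordered
version it grows monotonically, much like frequency does in the DFT.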
4. Comment on the energy compaction property of the Ordered Hadamard transform as
compared to that of the standard Hadamard Transform and, as a result of your answer,
explain which of the two transforms is more commonly used in image processing.
The Ordered Hadamard transform exhibits the property of energy compaction while
the standard Hadamard transform does not. Therefore, the standard Hadamard
transform is not suitable for compression, and the Ordered Hadamard transform is the
one more commonly used in image processing.
Problems
Let $f(x,y)$ denote the following constant $4 \times 4$ digital image that is zero outside
$0 \le x \le 3$, $0 \le y \le 3$, with $r$ a constant value.
r r r r
r r r r
r r r r
r r r r
Give the standard Hadamard Transform of f ( x, y) .
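$$H(0,0) = \frac{1}{4}\sum_{x=0}^{3}\sum_{y=0}^{3} f(x,y)\prod_{i=0}^{1}(-1)^{\left(b_i(x)b_i(0)+b_i(y)b_i(0)\right)} = \frac{1}{4}\sum_{x=0}^{3}\sum_{y=0}^{3} f(x,y)\prod_{i=0}^{1}(-1)^{\left(b_i(x)\cdot 0+b_i(y)\cdot 0\right)}$$
$$= \frac{1}{4}\sum_{x=0}^{3}\sum_{y=0}^{3} f(x,y)\prod_{i=0}^{1}(-1)^{0} = \frac{1}{4}\sum_{x=0}^{3}\sum_{y=0}^{3} f(x,y) = \frac{16r}{4} = 4r$$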
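Since the given image patch is of constant value, the rest of the Hadamard transform
coefficients will be zero: for every $(u,v) \neq (0,0)$ the kernel takes the values $+1$ and
$-1$ equally often over the 16 pixels, so the sum cancels.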
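The result can be confirmed by evaluating the bit-based definition of the transform
directly. This is a minimal sketch for a square N x N image with N a power of 2; the value
r = 5 and the function names are assumptions made for the example.

```python
def bit(value, i):
    """Return bit i of the binary representation of value."""
    return (value >> i) & 1

def hadamard2d(f):
    """Standard 2-D Hadamard transform of a square N x N image, N a power of 2."""
    N = len(f)
    n = N.bit_length() - 1  # number of bits needed to index 0 .. N-1

    def kernel(x, y, u, v):
        exponent = sum(bit(x, i) * bit(u, i) + bit(y, i) * bit(v, i) for i in range(n))
        return (-1) ** exponent

    return [[sum(f[x][y] * kernel(x, y, u, v) for x in range(N) for y in range(N)) / N
             for v in range(N)] for u in range(N)]

r = 5  # hypothetical constant intensity
image = [[r] * 4 for _ in range(4)]
H = hadamard2d(image)

for row in H:
    print(row)
```

Only the coefficient at (0, 0) is nonzero and it equals 4r, here 20.0.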
Karhunen Loeve Transform
Questions
Consider the population of random vectors $\mathbf{f}$ of the form
$$\mathbf{f} = \begin{bmatrix} f_1(x,y) \\ f_2(x,y) \\ \vdots \\ f_n(x,y) \end{bmatrix}.$$
Each component $f_i(x,y)$ represents an image. The population arises from their formation across
the entire collection of pixels. Suppose that $n > 2$, i.e. you have at least three images.
Consider now a population of random vectors of the form
$$\mathbf{g} = \begin{bmatrix} g_1(x,y) \\ g_2(x,y) \\ \vdots \\ g_n(x,y) \end{bmatrix}$$
where the vectors $\mathbf{g}$ are the Karhunen-Loeve transforms of the vectors $\mathbf{f}$.
1. Suppose some elements of the diagonal in the covariance matrix of the new population g
are very small. Comment on the significance of these elements in relation to processing the
images.
The element $\lambda_i$ represents the variance of the image $g_i(x,y)$. If $\lambda_i$ has a
very small value, then the variance of the image $g_i(x,y)$ is very small. Practically,
$g_i(x,y)$ then contains almost no useful information, so it can be neglected in the
computation of the inverse KL transform. This elimination is very useful for compression
purposes.
2. Suppose that a credible job could be done of reconstructing approximations to the n original
images by using only the two principal component images associated with the largest
eigenvalues (assume $n > 2$). What would be the mean square error incurred in doing so?
Express your answer as a percentage of the maximum possible error.
Mean square error $= \sum_{j=3}^{n} \lambda_j$.
Maximum error occurs when we keep only the largest eigenvalue and is equal to
$\sum_{j=2}^{n} \lambda_j$.
The error as a percentage of the maximum possible error is
$$100\,\frac{\sum_{j=3}^{n} \lambda_j}{\sum_{j=2}^{n} \lambda_j}.$$
3. Suppose that a credible job could be done of reconstructing approximations to the n
original images by using only the half of the principal component images associated with the
largest eigenvalues. What would be the mean square error incurred in doing so? Express
your answer as a percentage of both the maximum and the minimum possible error. Assume
that n is even.
Mean square error $= \sum_{j=n/2+1}^{n} \lambda_j$.
The error as a percentage of the maximum possible error is then
$$100\,\frac{\sum_{j=n/2+1}^{n} \lambda_j}{\sum_{j=2}^{n} \lambda_j}.$$
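The error expressions in Questions 2 and 3 amount to summing the discarded eigenvalues
once they are sorted in decreasing order. The sketch below illustrates this with a made-up
set of six eigenvalues; the numbers and the helper name `truncation_error` are hypothetical
and serve only to make the percentages concrete.

```python
def truncation_error(eigenvalues, kept):
    """Mean square error when only the `kept` largest principal components are used."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[kept:])

# Hypothetical eigenvalues of the covariance matrix (n = 6, so n is even).
eigenvalues = [8.0, 4.0, 2.0, 1.0, 0.5, 0.5]
n = len(eigenvalues)

max_error = truncation_error(eigenvalues, 1)       # keep only the largest eigenvalue
error_two = truncation_error(eigenvalues, 2)       # Question 2: keep two components
error_half = truncation_error(eigenvalues, n // 2) # Question 3: keep half of them

percent_two = 100 * error_two / max_error
percent_half = 100 * error_half / max_error
print(max_error, error_two, error_half)  # 8.0 4.0 2.0
print(percent_two, percent_half)         # 50.0 25.0
```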
4. Suppose that the covariance matrix of g turns out to be the identity matrix. Is the
Karhunen-Loeve transform useful in that case? Justify your answer.
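It is not useful. All eigenvalues are equal, so every component carries the same variance:
the original images are all equally important and none of them can be discarded.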
Problems
1. The covariance matrix of the population $\mathbf{f}$ calculated as part of the transform is
$$C_f = \begin{bmatrix} a & b & 0 \\ b & a & 0 \\ 0 & 0 & c \end{bmatrix}$$
(i) Suppose that a credible job could be done of reconstructing approximations to the three
original images by using one principal component image. What would be the mean
square error incurred in doing so, if it is known that $c < a - b$ and $b > 0$?
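The eigenvalues of the covariance matrix $C_f$ are found by the following relationship:
$$\det\begin{bmatrix} a-\lambda & b & 0 \\ b & a-\lambda & 0 \\ 0 & 0 & c-\lambda \end{bmatrix} = (c-\lambda)\left[(a-\lambda)^2 - b^2\right] = (c-\lambda)\left[(a-\lambda)-b\right]\left[(a-\lambda)+b\right] = 0$$
$$\lambda_1 = c,\quad \lambda_2 = a+b,\quad \lambda_3 = a-b$$
If $c < a - b$ then, because it is given that $b > 0$, the eigenvalues will be sorted
according to magnitude as $a + b > a - b > c$ and therefore, by using only one
principal component the error of reconstruction will be $a - b + c$.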
(ii) Suppose that a credible job could be done of reconstructing approximations to the three
original images by using two principal component images. What would be the mean
square error incurred in doing so, if it is known that $c > a + b$ and $b > 0$?
If $c > a + b$ the eigenvalues will be sorted according to magnitude as
$c > a + b > a - b$ and therefore by using only two principal components the error of
reconstruction will be $a - b$.
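Both cases of Problem 1 can be checked by substituting the claimed eigenvalues into the
characteristic polynomial and summing the discarded ones. This is a sketch under stated
assumptions: the numerical values of a, b and c are hypothetical, chosen only to satisfy
the conditions of cases (i) and (ii), and the helper names are made up for the example.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def characteristic(m, lam):
    """Value of det(m - lam * I); it is zero when lam is an eigenvalue of m."""
    shifted = [[m[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]
    return det3(shifted)

def covariance(a, b, c):
    return [[a, b, 0], [b, a, 0], [0, 0, c]]

def truncation_error(eigenvalues, kept):
    """Sum of the eigenvalues left out when the `kept` largest ones are retained."""
    return sum(sorted(eigenvalues, reverse=True)[kept:])

# Case (i): hypothetical values with c < a - b and b > 0.
a, b, c = 5.0, 2.0, 1.0
eig_i = [c, a + b, a - b]
residuals_i = [characteristic(covariance(a, b, c), lam) for lam in eig_i]
error_i = truncation_error(eig_i, kept=1)    # expected (a - b) + c

# Case (ii): hypothetical values with c > a + b and b > 0.
a2, b2, c2 = 5.0, 2.0, 9.0
eig_ii = [c2, a2 + b2, a2 - b2]
residuals_ii = [characteristic(covariance(a2, b2, c2), lam) for lam in eig_ii]
error_ii = truncation_error(eig_ii, kept=2)  # expected a - b

print(residuals_i, error_i)
print(residuals_ii, error_ii)
```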
2. The covariance matrix of the population $\mathbf{f}$ calculated as part of the transform is
$$C_f = \begin{bmatrix} a & 0 & b^2 \\ 0 & a & b^2 \\ b^2 & b^2 & a \end{bmatrix}$$
Suppose that a credible job could be done of reconstructing approximations to the three
original images by using one or two principal component images. What would be the mean
square error incurred in doing so in each case?
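The eigenvalues of the covariance matrix $C_f$ are found by the following relationship:
$$\det\begin{bmatrix} a-\lambda & 0 & b^2 \\ 0 & a-\lambda & b^2 \\ b^2 & b^2 & a-\lambda \end{bmatrix} = (a-\lambda)\left[(a-\lambda)^2 - b^4\right] - b^4(a-\lambda) = (a-\lambda)\left[(a-\lambda)^2 - 2b^4\right] = 0$$
$$\lambda_1 = a,\qquad (a-\lambda)^2 = 2b^4 \;\Rightarrow\; \lambda_{2,3} = a \pm \sqrt{2}\,b^2$$
We rearrange the indices according to the magnitudes of eigenvalues as follows:
$$\lambda_1 = \lambda_{\max} = a + \sqrt{2}\,b^2,\qquad \lambda_2 = a,\qquad \lambda_3 = \lambda_{\min} = a - \sqrt{2}\,b^2$$
The maximum error is $\lambda_2 + \lambda_3 = 2a - \sqrt{2}\,b^2$.
If we use one principal component the error is equal to the maximum possible error.
If we use two principal components the error is equal to $\lambda_3 = \lambda_{\min} = a - \sqrt{2}\,b^2$
and as a percentage of the maximum possible error is
$$100\,\frac{a - \sqrt{2}\,b^2}{2a - \sqrt{2}\,b^2}.$$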
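The eigenvalues of Problem 2 can be checked by verifying the eigenvector equation for each
of them. In the sketch below the eigenvectors (1, 1, sqrt 2), (1, -1, 0) and (1, 1, -sqrt 2)
are not given in the solution above; they are supplied here as an assumption that the code
itself tests. The values a = 5 and b = 1 are hypothetical and chosen so that all
eigenvalues are positive.

```python
import math

def matvec(m, v):
    """Product of a 3 x 3 matrix with a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

# Hypothetical values; a > sqrt(2) * b**2 keeps every eigenvalue positive.
a, b = 5.0, 1.0
s = b ** 2
C = [[a, 0, s], [0, a, s], [s, s, a]]
root2 = math.sqrt(2)

# (eigenvalue, candidate eigenvector) pairs, listed from largest to smallest eigenvalue.
pairs = [
    (a + root2 * s, [1, 1, root2]),
    (a, [1, -1, 0]),
    (a - root2 * s, [1, 1, -root2]),
]

# Largest component of C v - lambda v for each pair; zero means the pair is valid.
residuals = [max(abs(cv - lam * vi) for cv, vi in zip(matvec(C, v), v))
             for lam, v in pairs]

eigenvalues = [lam for lam, _ in pairs]
max_error = eigenvalues[1] + eigenvalues[2]  # one principal component kept
error_two = eigenvalues[2]                   # two principal components kept
percentage = 100 * error_two / max_error
print(residuals, max_error, error_two, round(percentage, 2))
```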