Image Transforms
Why Do Transforms?
Fast computation
  E.g., multiplication in the transform domain vs. convolution for a filter with wide support
Conceptual insight for various image processing tasks
  E.g., spatial-frequency information (smooth, moderate change, fast change, etc.)
Obtain transformed data as the measurement
  E.g., blurred images, radiology images (medical and astrophysics)
  Often need the inverse transform
  May need assistance from other transforms
For efficient storage and transmission
  Pick a few “representatives” (basis)
  Just store/send the “contribution” from each basis
Introduction
Image transforms are a class of
unitary matrices used for
representing images.
An image can be expanded in terms
of a discrete set of basis arrays called
basis images.
The basis images can be generated
by unitary matrices.
One-dimensional orthogonal and unitary transforms
For a 1-D sequence $\{u(n),\ 0 \le n \le N-1\}$ represented as a vector $\mathbf{u}$ of size $N$, a unitary transformation is written as
$$v(k) = \sum_{n=0}^{N-1} a(k,n)\,u(n), \quad 0 \le k \le N-1 \qquad \Longleftrightarrow \qquad \mathbf{v} = A\mathbf{u}$$
One-dimensional orthogonal and unitary transforms
$$u(n) = \sum_{k=0}^{N-1} v(k)\,a^*(k,n), \quad 0 \le n \le N-1 \qquad \Longleftrightarrow \qquad \mathbf{u} = A^{*T}\mathbf{v}$$
$v(k)$ is the series representation of the sequence $u(n)$.
The columns of $A^{*T}$, that is, the vectors $\mathbf{a}^*_k = \{a^*(k,n),\ 0 \le n \le N-1\}^T$, are called the basis vectors of $A$.
Two-dimensional orthogonal and unitary transforms
A general orthogonal series expansion for an $N \times N$ image $x(m,n)$ is a pair of transformations of the form
$$y(k,l) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} x(m,n)\,a_{k,l}(m,n)$$
$$x(m,n) = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} y(k,l)\,a^*_{k,l}(m,n)$$
where $\{a_{k,l}(m,n)\}$, called an image transform, is a set of complete orthonormal discrete basis functions.
Separable unitary transforms
Complexity: $O(N^4)$ in general.
Reduced to $O(N^3)$ when the transform is separable, i.e.
$$a_{k,l}(m,n) = a_k(m)\,b_l(n) = a(k,m)\,b(l,n)$$
where $\{a(k,m),\ k = 0,\ldots,N-1\}$ and $\{b(l,n),\ l = 0,\ldots,N-1\}$ are 1-D complete orthonormal sets of basis vectors.
Separable unitary transforms
$A = \{a(k,m)\}$ and $B = \{b(l,n)\}$ are unitary matrices, i.e. $AA^{*T} = A^T A^* = I$.
If $B$ is the same as $A$:
$$y(k,l) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} a(k,m)\,x(m,n)\,a(l,n) \qquad \Longleftrightarrow \qquad Y = AXA^T$$
$$x(m,n) = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} a^*(k,m)\,y(k,l)\,a^*(l,n) \qquad \Longleftrightarrow \qquad X = A^{*T}YA^*$$
Basis Images
Let $\mathbf{a}^*_k$ denote the $k$th column of $A^{*T}$. Define the matrices
$$A^*_{k,l} = \mathbf{a}^*_k\,\mathbf{a}^{*T}_l$$
then
$$X = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} y(k,l)\,A^*_{k,l}, \qquad y(k,l) = \left\langle X,\ A^*_{k,l} \right\rangle$$
The above equation expresses the image $X$ as a linear combination of the $N^2$ matrices $A^*_{k,l}$, $k,l = 0,\ldots,N-1$, called the basis images.
8x8 Basis images for discrete cosine transform.
Example
Consider an orthogonal matrix $A$ and image $X$:
$$A = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad X = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$
$$Y = AXA^T = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 5 & -1 \\ -2 & 0 \end{bmatrix}$$
To obtain the basis images, we take the outer products of the columns of $A^{*T}$:
$$A^*_{0,0} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad A^*_{0,1} = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}, \quad A^*_{1,0} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix}, \quad A^*_{1,1} = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$$
The inverse transformation gives
$$X = A^{*T}YA^* = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 5 & -1 \\ -2 & 0 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$
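The example can be checked numerically. A minimal NumPy sketch (illustrative only; the use of NumPy and the variable names are assumptions, not part of the original material):

```python
import numpy as np

# Orthogonal transform matrix A and image X from the example above
A = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Forward separable transform: Y = A X A^T
Y = A @ X @ A.T
print(Y)                                            # [[ 5. -1.], [-2.  0.]]

# Inverse transform: X = A^{*T} Y A^*  (A is real, so A^* = A)
print(np.allclose(A.conj().T @ Y @ A.conj(), X))    # True

# Basis-image expansion: X = sum_{k,l} y(k,l) * A*_{k,l}
X_sum = np.zeros_like(X)
for k in range(2):
    for l in range(2):
        basis_kl = np.outer(A.conj().T[:, k], A.conj().T[:, l])
        X_sum += Y[k, l] * basis_kl
print(np.allclose(X_sum, X))                        # True
```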
Properties of Unitary Transforms
Energy Conservation
In a unitary transformation $\mathbf{y} = A\mathbf{x}$, $\|\mathbf{y}\|^2 = \|\mathbf{x}\|^2$.
Proof:
$$\|\mathbf{y}\|^2 = \sum_{k=0}^{N-1} |y(k)|^2 = \mathbf{y}^{*T}\mathbf{y} = \mathbf{x}^{*T}A^{*T}A\mathbf{x} = \mathbf{x}^{*T}\mathbf{x} = \sum_{n=0}^{N-1} |x(n)|^2 = \|\mathbf{x}\|^2$$
This means every unitary transformation is simply a rotation of the vector $\mathbf{x}$ in the $N$-dimensional vector space.
Alternatively, a unitary transformation is a rotation of the basis coordinates, and the components of $\mathbf{y}$ are the projections of $\mathbf{x}$ onto the new basis.
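A quick numerical check of energy conservation (a sketch; the orthogonal matrix is generated from a QR factorisation purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
# Any unitary matrix works; here a real orthogonal Q from a QR factorisation
A, _ = np.linalg.qr(rng.standard_normal((N, N)))

x = rng.standard_normal(N)
y = A @ x

print(np.allclose(np.linalg.norm(y), np.linalg.norm(x)))   # True: ||y|| == ||x||
print(np.allclose(A @ A.T, np.eye(N)))                     # True: A A^T = I
```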
Properties of Unitary Transforms
Energy compaction
Unitary transforms pack a large fraction of the average energy of the image into relatively few transform coefficients, i.e. many of the transform coefficients contain very little energy.
Decorrelation
When the input vector elements are highly
correlated, the transform coefficients tend to be
uncorrelated.
The covariance matrix is $E\left[(\mathbf{y} - E(\mathbf{y}))(\mathbf{y} - E(\mathbf{y}))^{*T}\right]$; small correlation implies small off-diagonal terms.
1-D Discrete Fourier Transform
The discrete Fourier transform (DFT) of a sequence $\{x(n),\ n = 0,\ldots,N-1\}$ is defined as
$$y(k) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x(n)\,W_N^{nk}, \quad k = 0,\ldots,N-1, \qquad \text{where } W_N = \exp\!\left(-\frac{j2\pi}{N}\right)$$
The inverse transform is given by
$$x(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} y(k)\,W_N^{-nk}, \quad n = 0,\ldots,N-1$$
The $N \times N$ unitary DFT matrix $F$ is given by
$$F = \left\{ \frac{1}{\sqrt{N}}\,W_N^{nk} \right\}, \quad 0 \le k, n \le N-1$$
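A small sketch constructing the unitary DFT matrix and cross-checking it against NumPy's FFT, which uses the unnormalised convention (hence the 1/sqrt(N) scaling); names are illustrative:

```python
import numpy as np

N = 8
n = np.arange(N)
# Unitary DFT matrix: F[k, n] = W_N^{nk} / sqrt(N), with W_N = exp(-j*2*pi/N)
F = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

x = np.random.randn(N)

print(np.allclose(F @ x, np.fft.fft(x) / np.sqrt(N)))   # True: matches NumPy's FFT up to scaling
print(np.allclose(F.conj().T @ (F @ x), x))             # True: inverse via F^{*T}
```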
DFT Properties
Circular shift: $x(n-l)_c = x[(n-l) \bmod N]$.
The DFT and unitary DFT matrices are symmetric, i.e. $F^{-1} = F^*$.
A DFT of length $N$ can be computed by a fast algorithm (FFT) in $O(N \log_2 N)$ operations.
The DFT of a real sequence $\{x(n),\ n = 0,\ldots,N-1\}$ is conjugate symmetric about $N/2$, i.e. $y^*(N-k) = y(k)$.
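A brief check of the conjugate-symmetry property for a real sequence (illustrative sketch):

```python
import numpy as np

N = 8
x = np.random.randn(N)            # real sequence
y = np.fft.fft(x) / np.sqrt(N)    # unitary DFT

# Conjugate symmetry about N/2: y*(N-k) = y(k) for k = 1, ..., N-1
k = np.arange(1, N)
print(np.allclose(np.conj(y[N - k]), y[k]))   # True
```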
The Two-Dimensional DFT
The 2-D DFT of an $N \times N$ image $\{x(m,n)\}$ is a separable transform defined as
$$y(k,l) = \frac{1}{N} \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} x(m,n)\,W_N^{km}W_N^{ln}, \quad 0 \le k,l \le N-1$$
The inverse transform is
$$x(m,n) = \frac{1}{N} \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} y(k,l)\,W_N^{-km}W_N^{-ln}, \quad 0 \le m,n \le N-1$$
In matrix notation: $Y = FXF$ and $X = F^*YF^*$.
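A sketch of the separable 2-D unitary DFT, cross-checked against NumPy's fft2 (which omits the 1/N normalisation); the test image is random and purely illustrative:

```python
import numpy as np

N = 8
n = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)   # unitary DFT matrix

X = np.random.randn(N, N)

# Separable 2-D unitary DFT: Y = F X F  (F is symmetric, so no transpose is needed)
Y = F @ X @ F
print(np.allclose(Y, np.fft.fft2(X) / N))        # True
# Inverse: X = F* Y F*
print(np.allclose(F.conj() @ Y @ F.conj(), X))   # True
```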
Properties of the 2-D DFT
Symmetric and unitary: $F^T = F$, $F^{-1} = F^*$.
Periodic: $y(k \pm N,\ l \pm N) = y(k,l)$ and $x(m \pm N,\ n \pm N) = x(m,n)$.
Conjugate symmetry (for a real image): $y(k,l) = y^*(N-k,\ N-l)$, $0 \le k,l \le N-1$.
Fast transform: $O(N^2 \log_2 N)$.
Basis images: $A^*_{k,l} = \mathbf{a}^*_k\,\mathbf{a}^{*T}_l = \left\{ \frac{1}{N}\,W_N^{-(km+ln)},\ 0 \le m,n \le N-1 \right\}$, $0 \le k,l \le N-1$.
2-D pulse DFT: the DFT of a square pulse is a 2-D sinc function.
FT is Shift Invariant
After shifting:
• Magnitude stays constant
• Phase changes
Rotation
• FT of a rotated image also
rotates
The Cosine Transform (DCT)
The $N \times N$ cosine transform matrix $C = \{c(k,n)\}$, also known as the discrete cosine transform (DCT), is defined as
$$c(k,n) = \begin{cases} \sqrt{\dfrac{1}{N}}, & k = 0,\ 0 \le n \le N-1 \\[1ex] \sqrt{\dfrac{2}{N}}\,\cos\dfrac{\pi(2n+1)k}{2N}, & 1 \le k \le N-1,\ 0 \le n \le N-1 \end{cases}$$
The 1-D DCT of a sequence $\{x(n),\ 0 \le n \le N-1\}$ is defined as
$$y(k) = \alpha(k)\sum_{n=0}^{N-1} x(n)\cos\frac{\pi(2n+1)k}{2N}, \quad 0 \le k \le N-1$$
The inverse transformation is given by
$$x(n) = \sum_{k=0}^{N-1} \alpha(k)\,y(k)\cos\frac{\pi(2n+1)k}{2N}, \quad 0 \le n \le N-1$$
where $\alpha(0) = \sqrt{1/N}$ and $\alpha(k) = \sqrt{2/N}$ for $1 \le k \le N-1$.
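A sketch constructing the DCT matrix from this definition; the comparison with SciPy's orthonormal DCT-II is included only as a cross-check and assumes SciPy is available:

```python
import numpy as np
from scipy.fft import dct   # cross-check only

N = 8
n = np.arange(N)
k = n.reshape(-1, 1)

# c(0, n) = sqrt(1/N);  c(k, n) = sqrt(2/N) * cos(pi*(2n+1)*k / (2N)) for k >= 1
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

x = np.random.randn(N)

print(np.allclose(C @ C.T, np.eye(N)))            # True: C is orthogonal
print(np.allclose(C @ x, dct(x, norm='ortho')))   # True: matches SciPy's orthonormal DCT-II
```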
Properties of DCT
The DCT is real and orthogonal, i.e. $C = C^*$ and $C^{-1} = C^T$.
The DCT matrix is not symmetric.
The DCT is a fast transform: $O(N \log_2 N)$.
Excellent energy compaction for
highly correlated data.
Useful in designing transform coders and
Wiener filters for images.
2-D DCT
The 2-D DCT kernel is given by
$$c(m,n,k,l) = \alpha(k)\,\alpha(l)\cos\frac{\pi(2m+1)k}{2N}\cos\frac{\pi(2n+1)l}{2N}$$
where
$$\alpha(k) = \begin{cases} \sqrt{1/N}, & k = 0 \\ \sqrt{2/N}, & 1 \le k \le N-1 \end{cases}$$
and similarly for $\alpha(l)$.
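Because the DCT is real and separable, the 2-D DCT can be computed as $CXC^T$ with the 1-D DCT matrix $C$ defined earlier. A short sketch (the SciPy call is only a cross-check):

```python
import numpy as np
from scipy.fft import dctn   # cross-check only

N = 8
n = np.arange(N)
k = n.reshape(-1, 1)
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

X = np.random.randn(N, N)

# Separable 2-D DCT: Y = C X C^T (C is real, so no conjugation is needed)
Y = C @ X @ C.T
print(np.allclose(Y, dctn(X, norm='ortho')))   # True
# Inverse: X = C^T Y C
print(np.allclose(C.T @ Y @ C, X))             # True
```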
DCT example
a) Original image b) DCT image
The Sine Transform
The $N \times N$ DST matrix $\Psi = \{\psi(k,n)\}$ is defined as
$$\psi(k,n) = \sqrt{\frac{2}{N+1}}\,\sin\frac{\pi(k+1)(n+1)}{N+1}, \quad 0 \le k,n \le N-1$$
The sine transform pair for a 1-D sequence is defined as
$$y(k) = \sum_{n=0}^{N-1} \psi(k,n)\,x(n), \quad 0 \le k \le N-1$$
$$x(n) = \sum_{k=0}^{N-1} \psi(k,n)\,y(k), \quad 0 \le n \le N-1$$
Properties of the Sine transform
The sine transform is real, symmetric, and orthogonal:
$$\Psi^* = \Psi = \Psi^T = \Psi^{-1}$$
The sine transform is a fast transform.
It has very good energy compaction for images.
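A sketch constructing the DST matrix and verifying that it is symmetric, orthogonal, and its own inverse (illustrative only):

```python
import numpy as np

N = 8
n = np.arange(N)
k = n.reshape(-1, 1)

# psi(k, n) = sqrt(2/(N+1)) * sin(pi*(k+1)*(n+1)/(N+1))
Psi = np.sqrt(2.0 / (N + 1)) * np.sin(np.pi * (k + 1) * (n + 1) / (N + 1))

print(np.allclose(Psi, Psi.T))                # True: symmetric
print(np.allclose(Psi @ Psi.T, np.eye(N)))    # True: orthogonal
x = np.random.randn(N)
print(np.allclose(Psi @ (Psi @ x), x))        # True: Psi is its own inverse
```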
The Hadamard transform
The elements of the basis vectors of the
Hadamard transform take only the binary
values ±1.
Well suited for digital signal processing.
The transform matrices $H_n$ are $N \times N$ matrices, where $N = 2^n$, $n = 1, 2, 3, \ldots$
The core matrix is given by
$$H_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$
The Hadamard transform
The matrix $H_n$ can be obtained by the Kronecker product recursion
$$H_n = H_{n-1} \otimes H_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix}$$
Example: $H_3 = H_2 \otimes H_1$ and $H_2 = H_1 \otimes H_1$, so
$$H_3 = \frac{1}{\sqrt{8}}\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{bmatrix}$$
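A sketch of the Kronecker recursion, cross-checked against SciPy's (unnormalised) Hadamard matrix; the sequency computation simply counts sign changes per row:

```python
import numpy as np
from scipy.linalg import hadamard   # cross-check only

n = 3
N = 2 ** n

# Build H_n by the Kronecker recursion, normalising by 1/sqrt(2) at each step
H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
H = H1
for _ in range(n - 1):
    H = np.kron(H, H1)

print(np.allclose(H @ H.T, np.eye(N)))             # True: orthogonal
print(np.allclose(H, hadamard(N) / np.sqrt(N)))    # True: matches SciPy up to normalisation

# Sequency: number of sign changes along each row
sequency = [int((np.diff(np.sign(row)) != 0).sum()) for row in H]
print(sequency)                                    # [0, 7, 3, 4, 1, 6, 2, 5]
```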
The Hadamard transform properties
The number of sign changes in a row is called the sequency. The sequency ordering for $H_3$ is 0, 7, 3, 4, 1, 6, 2, 5.
The transform is real, symmetric, and orthogonal:
$$H^* = H = H^T = H^{-1}$$
The transform is fast.
Good energy compaction for highly correlated data.
The Haar transform
The Haar functions $h_k(x)$ are defined on a continuous interval, $x \in [0,1]$, for $k = 0,\ldots,N-1$, where $N = 2^n$.
The integer $k$ can be uniquely decomposed as $k = 2^p + q - 1$, where $0 \le p \le n-1$; $q = 0, 1$ for $p = 0$ and $1 \le q \le 2^p$ for $p \ne 0$.
For example, when $N = 4$:
k: 0 1 2 3
p: 0 0 1 1
q: 0 1 1 2
The Haar transform
The Haar functions are defined as
$$h_0(x) = h_{0,0}(x) = \frac{1}{\sqrt{N}}, \quad x \in [0,1]$$
$$h_k(x) = h_{p,q}(x) = \frac{1}{\sqrt{N}}\begin{cases} 2^{p/2}, & \dfrac{q-1}{2^p} \le x < \dfrac{q-\frac{1}{2}}{2^p} \\[1ex] -2^{p/2}, & \dfrac{q-\frac{1}{2}}{2^p} \le x < \dfrac{q}{2^p} \\[1ex] 0, & \text{otherwise} \end{cases} \qquad \text{for } x \in [0,1]$$
Haar transform example
The Haar transform matrix is obtained by letting $x$ take the discrete values $m/N$, $m = 0,\ldots,N-1$. For $N = 4$, the transform is
$$Hr = \frac{1}{\sqrt{4}}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ \sqrt{2} & -\sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{bmatrix}$$
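A sketch of the N = 4 Haar matrix and a check of its orthogonality (illustrative only):

```python
import numpy as np

# Haar transform matrix for N = 4, sampled from the continuous Haar functions
N = 4
Hr = np.array([[1.0, 1.0, 1.0, 1.0],
               [1.0, 1.0, -1.0, -1.0],
               [np.sqrt(2), -np.sqrt(2), 0.0, 0.0],
               [0.0, 0.0, np.sqrt(2), -np.sqrt(2)]]) / np.sqrt(N)

print(np.allclose(Hr @ Hr.T, np.eye(N)))   # True: orthogonal, so Hr^{-1} = Hr^T

x = np.random.randn(N)
y = Hr @ x                                  # forward Haar transform
print(np.allclose(Hr.T @ y, x))             # True: inverse via the transpose
```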
Properties of Haar transform
The Haar transform is real and orthogonal:
$$Hr = Hr^*, \qquad Hr^{-1} = Hr^T$$
Haar transform is very fast:
O(N)
The basis vectors are
sequency ordered.
KL transform
Hotelling transform
Originally introduced as a series expansion for continuous random processes by Karhunen and Loeve.
The discrete equivalent of the KL series expansion was studied by Hotelling.
KL transform is also called the
Hotelling transform or the method of
principal components.
KL transform
Let $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T$ be an $n \times 1$ random vector.
For $K$ vector samples from a random population, the mean vector is given by
$$\mathbf{m}_x = \frac{1}{K}\sum_{k=1}^{K} \mathbf{x}_k$$
The covariance matrix of the population is given by
$$C_x = \frac{1}{K}\sum_{k=1}^{K} \mathbf{x}_k\mathbf{x}_k^T - \mathbf{m}_x\mathbf{m}_x^T$$
KL Transform
$C_x$ is an $n \times n$ real, symmetric matrix. Therefore a set of $n$ orthonormal eigenvectors exists.
Let $\mathbf{e}_i$ and $\lambda_i$, $i = 1, 2, \ldots, n$, be the eigenvectors and corresponding eigenvalues of $C_x$, arranged in descending order so that $\lambda_j \ge \lambda_{j+1}$ for $j = 1, 2, \ldots, n-1$.
Let $A$ be the matrix whose rows are formed from the eigenvectors of $C_x$, ordered so that the first row of $A$ is the eigenvector corresponding to the largest eigenvalue and the last row is the eigenvector corresponding to the smallest eigenvalue.
KL Transform
Suppose we use $A$ as a transformation matrix to map the vectors $\mathbf{x}$ into the vectors $\mathbf{y}$ as follows:
$$\mathbf{y} = A(\mathbf{x} - \mathbf{m}_x)$$
This expression is called the Hotelling transform.
The mean of the $\mathbf{y}$ vectors resulting from this transformation is zero; that is, $\mathbf{m}_y = E\{\mathbf{y}\} = 0$.
KL Transform
The covariance matrix of the $\mathbf{y}$'s is given in terms of $A$ and $C_x$ by
$$C_y = AC_xA^T$$
$C_y$ is a diagonal matrix whose elements along the main diagonal are the eigenvalues of $C_x$:
$$C_y = \begin{bmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{bmatrix}$$
KL Transform
The off-diagonal elements of this
covariance matrix are 0, so that the
elements of the y vectors are
uncorrelated.
$C_x$ and $C_y$ have the same eigenvalues.
The inverse transformation is given
by
x = ATy + mx
KL transform
Suppose, instead of using all the eigenvectors of $C_x$, we form a $k \times n$ transformation matrix $A_k$ from the $k$ eigenvectors corresponding to the $k$ largest eigenvalues. The vector reconstructed using $A_k$ is
$$\hat{\mathbf{x}} = A_k^T\mathbf{y} + \mathbf{m}_x$$
The mean square error between $\mathbf{x}$ and $\hat{\mathbf{x}}$ is
$$e_{ms} = \sum_{j=1}^{n}\lambda_j - \sum_{j=1}^{k}\lambda_j = \sum_{j=k+1}^{n}\lambda_j$$
KL Transform
As the $\lambda_j$'s decrease monotonically, the error can be minimised by selecting the $k$ eigenvectors associated with the largest eigenvalues.
Thus the Hotelling transform is optimal, i.e. it minimises the mean square error between $\mathbf{x}$ and $\hat{\mathbf{x}}$.
Due to the idea of using the eigenvectors
corresponding to the largest eigenvalues, the
Hotelling transform is also known as the
principal components transform.
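A sketch of the KL/Hotelling transform on synthetic correlated data, verifying that $C_y$ is diagonal and that the mean square error of a $k$-component reconstruction equals the sum of the discarded eigenvalues (the data and variable names are illustrative, not from the original material):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 3, 1000
# Correlated sample vectors, stored as columns of x
x = rng.standard_normal((n, K))
x[1] += 0.8 * x[0]
x[2] += 0.5 * x[0]

mx = x.mean(axis=1, keepdims=True)
Cx = (x - mx) @ (x - mx).T / K

# Eigenvectors as rows of A, sorted by descending eigenvalue
lam, E = np.linalg.eigh(Cx)
order = np.argsort(lam)[::-1]
lam, A = lam[order], E[:, order].T

y = A @ (x - mx)                       # Hotelling transform
Cy = y @ y.T / K
print(np.allclose(Cy, np.diag(lam)))   # True: Cy is diagonal

# Keep only the k largest components and reconstruct
k = 2
Ak = A[:k, :]
x_hat = Ak.T @ (Ak @ (x - mx)) + mx

mse = np.mean(np.sum((x - x_hat) ** 2, axis=0))
print(np.isclose(mse, lam[k:].sum()))  # True: error equals the sum of discarded eigenvalues
```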
KL transform example
Consider the four sample vectors
$$\mathbf{x}_1 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{x}_3 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{x}_4 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$
Then
$$\mathbf{m}_x = \frac{1}{4}\begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}, \qquad C_x = \frac{1}{16}\begin{bmatrix} 3 & 1 & 1 \\ 1 & 3 & -1 \\ 1 & -1 & 3 \end{bmatrix}$$
The eigenvalues of $C_x$ are $\lambda_1 = \lambda_2 = 0.25$ and $\lambda_3 = 0.0625$. One choice of orthonormal eigenvectors (rows of $A$, largest eigenvalue first) is
$$A = \begin{bmatrix} 0.8018 & 0.5345 & 0.2673 \\ 0.1543 & -0.6172 & 0.7715 \\ 0.5774 & -0.5774 & -0.5774 \end{bmatrix}$$
Applying $\mathbf{y} = A(\mathbf{x} - \mathbf{m}_x)$ to the four samples gives
$$[\mathbf{y}_1\ \mathbf{y}_2\ \mathbf{y}_3\ \mathbf{y}_4] = \begin{bmatrix} -0.8018 & 0 & 0.5345 & 0.2673 \\ -0.1543 & 0 & -0.6172 & 0.7715 \\ -0.1443 & 0.4330 & -0.1443 & -0.1443 \end{bmatrix}$$
so that
$$\mathbf{m}_y = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \qquad C_y = AC_xA^T = \begin{bmatrix} 0.25 & 0 & 0 \\ 0 & 0.25 & 0 \\ 0 & 0 & 0.0625 \end{bmatrix}$$
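The numbers in this example can be reproduced with a few lines of NumPy (illustrative sketch; the eigenvector signs returned by the solver may differ, but $C_y$ is diagonal either way):

```python
import numpy as np

# The four sample vectors, stored as columns
x = np.array([[0, 1, 1, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
K = x.shape[1]

mx = x.mean(axis=1, keepdims=True)
Cx = x @ x.T / K - mx @ mx.T
print(mx.ravel())        # [0.75 0.25 0.25]
print(Cx * 16)           # [[ 3.  1.  1.] [ 1.  3. -1.] [ 1. -1.  3.]]

lam, E = np.linalg.eigh(Cx)
order = np.argsort(lam)[::-1]
lam, A = lam[order], E[:, order].T
print(lam)               # approximately [0.25, 0.25, 0.0625]

y = A @ (x - mx)
print(np.allclose(y @ y.T / K, np.diag(lam)))   # True: Cy is diagonal
```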
KL Transform Example
a) Original image
b) Reconstructed image using all three principal components
c) Reconstructed image using the two largest principal components
d) Reconstructed image using only the largest principal component