Math5390 Chapter 5
In Chapter 4, we discussed how to enhance an image in the frequency domain. In this chapter, we discuss how image enhancement can be done in the spatial domain.
Definition 1.1. A linear filter is a process that modifies a pixel value by a linear combination of the pixel values in its local neighbourhood.
Example 1.2. Let f be an N × N image. Extend the image periodically. Modify f to f̃ by:

f̃ = f ∗ H,   where   H =
0 2 0
0 1 0
0 3 0

A geometric illustration of the idea is as follows: applying H by convolution in the spatial domain corresponds, under the DFT, to pointwise multiplication by Ĥ in the frequency domain (up to the DFT normalization constant), so that \widehat{Ĩ} = Î × Ĥ:

   I  ──DFT──▶  Î
   │ ∗H         │ ×Ĥ
   ▼            ▼
   Ĩ  ──DFT──▶  \widehat{Ĩ}
      ◀─iDFT──
 (spatial)    (frequency)
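Here is a minimal Python sketch of the periodic linear filtering in Example 1.2; the helper name apply_filter is our choice, and scipy.ndimage.convolve with mode='wrap' models the periodic extension:

```python
import numpy as np
from scipy.ndimage import convolve

def apply_filter(f, H):
    # mode='wrap' extends the image periodically, as in Example 1.2
    return convolve(f, H, mode='wrap')

f = np.arange(16, dtype=float).reshape(4, 4)   # small test image
H = np.array([[0, 2, 0],
              [0, 1, 0],
              [0, 3, 0]], dtype=float)
print(apply_filter(f, H))
```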
• Mean filter:

H = (1/9) ·
1 1 1
1 1 1
1 1 1

(Here, we only write down the entries of the matrix for indices −1 ≤ k, l ≤ 1 for simplicity. All other matrix entries are equal to 0.)
This is called mean filtering with window size 3 × 3.
Below is an example of mean filtering on an image with impulse noise:
Below is an example of mean filtering on an image with Gaussian noise:
• Gaussian filter: The entries of H are given by the Gaussian function g(r) = exp(−r²/(2σ²)), where r = √(x² + y²).
Below is an example of Gaussian filtering on an image:
(Please refer to “Lecture 17 powerpoint” for some more examples of mean filter and Gaussian
filter on real images)
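A minimal sketch of mean and Gaussian filtering in Python; the scipy.ndimage helpers and parameter values are our illustrative choices (mode='wrap' again models the periodic extension):

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

img = np.random.rand(64, 64)                              # stand-in noisy image

mean_filtered = uniform_filter(img, size=3, mode='wrap')       # 3x3 mean filter
gauss_filtered = gaussian_filter(img, sigma=2.0, mode='wrap')  # Gaussian filter
```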
Properties of convolution:
• Associativity: A ∗ (B ∗ C) = (A ∗ B) ∗ C
• Commutativity: I ∗ H = H ∗ I
• Linearity: (s · I) ∗ H = I ∗ (s · H) = s · (I ∗ H) and (I₁ + I₂) ∗ H = (I₁ ∗ H) + (I₂ ∗ H)
Remark. 1. Advantage of the Gaussian filter: one can check that the convolution of two Gaussian filters is again a Gaussian filter (with a larger σ).
2. Thus, successive Gaussian filtering is equivalent to a single Gaussian filtering with a larger σ (because of the property of associativity).
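Concretely, for the normalized Gaussian kernels (defined later in this chapter with the 1/(2πσ²) factor), the standard deviations combine in quadrature, a standard fact stated here for reference:

g_{σ₁} ∗ g_{σ₂} = g_{√(σ₁² + σ₂²)},

so k successive filterings with the same σ amount to a single Gaussian filter with standard deviation σ√k.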
Proof of associativity. By the convolution theorem, \widehat{a ∗ b}(m) = C â(m) b̂(m) for a normalization constant C depending on the DFT convention, and a(n) is recovered from â by the inverse DFT. Then

[(x ∗ y) ∗ z](n) = Σ_{m=0}^{N−1} \widehat{(x ∗ y) ∗ z}(m) e^{j2πmn/N}
                 = Σ_{m=0}^{N−1} C \widehat{x ∗ y}(m) ẑ(m) e^{j2πmn/N}
                 = Σ_{m=0}^{N−1} C² x̂(m) ŷ(m) ẑ(m) e^{j2πmn/N}
                 = Σ_{m=0}^{N−1} C x̂(m) \widehat{y ∗ z}(m) e^{j2πmn/N}
                 = Σ_{m=0}^{N−1} \widehat{x ∗ (y ∗ z)}(m) e^{j2πmn/N}
                 = [x ∗ (y ∗ z)](n)
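A quick numerical sanity check of associativity, using a DFT-based circular convolution as in the proof (the helper name and test sizes are our choices):

```python
import numpy as np

def circ_conv(a, b):
    # circular (periodic) convolution via the DFT convolution theorem
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

rng = np.random.default_rng(0)
x, y, z = rng.standard_normal((3, 64))
print(np.allclose(circ_conv(circ_conv(x, y), z),
                  circ_conv(x, circ_conv(y, z))))   # True
```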
Median filter
Take a window with center at pixel (x₀, y₀). Update the pixel value at (x₀, y₀) from I(x₀, y₀) to
Ĩ(x₀, y₀) = median(I within the window).
Example 1.5. If the pixel values within a window are 0, 0, 1, 2, 3, 7, 8, 9, 9, then the pixel value is
updated as 3 (median).
Below is a comparison of mean and median filtering on an image:
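A minimal numpy sketch of 3 × 3 median filtering (the helper name and the periodic boundary handling are our choices):

```python
import numpy as np

def median_filter3(I):
    # gather the 9 periodically-shifted copies covering each 3x3 window
    shifts = [np.roll(np.roll(I, dy, axis=0), dx, axis=1)
              for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return np.median(np.stack(shifts), axis=0)

I = np.array([[0, 0, 1],
              [2, 3, 7],
              [8, 9, 9]], dtype=float)
print(median_filter3(I)[1, 1])   # 3.0, as in Example 1.5
```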
Edge-preserving filter
(Figure: various windows of fixed size around a pixel.)
• Step 1: Consider all windows of a certain size around pixel (x₀, y₀) (not necessarily centered at (x₀, y₀));
• Step 2: Select the window with minimal variance;
• Step 3: Apply a linear filter (mean filter, Gaussian filter, and so on) over the selected window (see the sketch after the example below).
Below is an example of edge-preserving filtering on an image (compared with other types of filtering):
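A minimal sketch of this edge-preserving scheme (a Kuwahara-type filter); the brute-force loops, the periodic padding, and the use of the window mean in Step 3 are our illustrative choices:

```python
import numpy as np

def edge_preserving_filter(I, a=1):
    """Replace each pixel by the mean of the (2a+1)x(2a+1) window of
    minimal variance among all such windows containing the pixel."""
    H, W = I.shape
    P = np.pad(I, 2 * a, mode='wrap')          # room for off-center windows
    out = np.empty_like(I, dtype=float)
    for y in range(H):
        for x in range(W):
            best_var, best_mean = np.inf, 0.0
            # window top-left corners range over all windows containing (y, x)
            for cy in range(y, y + 2 * a + 1):
                for cx in range(x, x + 2 * a + 1):
                    win = P[cy:cy + 2 * a + 1, cx:cx + 2 * a + 1]
                    v = win.var()
                    if v < best_var:                  # Step 2: minimal variance
                        best_var, best_mean = v, win.mean()
            out[y, x] = best_mean                     # Step 3: mean filter
    return out
```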
Non-local mean filter
For X = (x, y) and X′ = (x′, y′), consider the windows
S_X = {(x + s, y + t) : −a ≤ s, t ≤ a};   S_X′ = {(x′ + s, y′ + t) : −a ≤ s, t ≤ a}.
Denote g_X = g|_{S_X} and g_X′ = g|_{S_X′}; then g_X and g_X′ are two m × m small images, where m = 2a + 1.
We call g_X and g_X′ the local patches of g at X and X′ respectively.
Apply a Gaussian filter (linear) to g_X and g_X′ to get g̃_X and g̃_X′.
Define the least-squares distance between two local patches as:
‖g̃_X − g̃_X′‖² = sum of squares of the entries of the matrix g̃_X − g̃_X′.
Remark:
• The image g is often: (1) periodically extended; or (2) set to zero outside the image domain.
• The weight is smaller if the overall intensities of the local patches at X and X′ are different.
(Please see “Lecture 18 powerpoint” for more illustration of non-local mean filter on real images)
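A minimal sketch of the resulting non-local mean filter; the exponential weight w = exp(−‖g̃_X − g̃_X′‖²/h²) is the standard choice and an assumption here, as are the small search window and parameter values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nl_means(g, a=1, search=3, h=0.5):
    """Non-local mean: average pixels weighted by patch similarity."""
    gs = gaussian_filter(g, sigma=1.0, mode='wrap')  # smoothed patches g~
    H, W = g.shape
    P = np.pad(gs, a, mode='wrap')

    def patch(y, x):                 # (2a+1)x(2a+1) patch of g~ at (y, x)
        return P[y:y + 2 * a + 1, x:x + 2 * a + 1]

    out = np.zeros_like(g, dtype=float)
    for y in range(H):
        for x in range(W):
            wsum = 0.0
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = (y + dy) % H, (x + dx) % W
                    d2 = np.sum((patch(y, x) - patch(yy, xx)) ** 2)
                    w = np.exp(-d2 / h ** 2)   # assumed standard NL-means weight
                    out[y, x] += w * g[yy, xx]
                    wsum += w
            out[y, x] /= wsum
    return out
```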
(where ∇· denotes the divergence, defined by ∇ · (v₁, v₂) = ∂v₁/∂x + ∂v₂/∂y, and ∇I = gradient of I = (∂I/∂x, ∂I/∂y))
Then, the Gaussian function with standard deviation σ is

g(x, y; σ) = 1/(2πσ²) · exp(−(x² + y²)/(2σ²)),

and the Gaussian-smoothed image is the (continuous) convolution

Ĩ(x, y; σ) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(u, v; σ) I(x − u, y − v) du dv.

This is analogous to the discrete convolution:

I ∗ J(u, v) = Σ_m Σ_n I(u − m, v − n) J(m, n)
Since the Gaussian satisfies the identity ∂g/∂σ = σ(∂²g/∂u² + ∂²g/∂v²), we compute:

∂Ĩ/∂σ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (∂g(u, v; σ)/∂σ) I(x − u, y − v) du dv
      = σ ∫∫ (∂²g(u, v; σ)/∂u²) I(x − u, y − v) du dv + σ ∫∫ (∂²g(u, v; σ)/∂v²) I(x − u, y − v) du dv
      = σ ∂²/∂x² ∫∫ g(u, v; σ) I(x − u, y − v) du dv + σ ∂²/∂y² ∫∫ g(u, v; σ) I(x − u, y − v) du dv
      = σ ∇ · (∇Ĩ(x, y; σ))
Possible choices of K(x, y):
1. K(x, y) = 1/|∇I(x, y)|   (not good, as ∇I can be 0)
   Modification: K(x, y) = 1/(|∇I(x, y)| + ε²) for ε small
2. K(x, y) = exp(−|∇I(x, y; σ)|/b)
Remark. We can choose K such that K depends on |∇I| and K ≈ 0 when |∇I| is big.
Hence, the anisotropic diffusion algorithm for solving the image de-noising problem can be written as:

∂I(x, y; σ)/∂σ = ∇ · ( exp(−|∇I(x, y; σ)|/b) ∇I(x, y; σ) )        (∗∗)
where D₁ is a linear operator approximating the divergence ∇· and D₂ is a linear operator approximating the gradient ∇.
Starting with I⁰(x, y) = I(x, y), the original image, we iterate to de-noise it. Such a process is called anisotropic diffusion image denoising.
Below is an example of (edge-preserving) anisotropic diffusion on an image:
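A minimal numpy sketch of an explicit scheme for (∗∗) (Perona–Malik-type diffusion); the finite differences, step size, and parameter values are our illustrative choices:

```python
import numpy as np

def anisotropic_diffusion(I, b=0.1, dt=0.1, n_iter=50):
    I = I.astype(float).copy()
    for _ in range(n_iter):
        # forward differences (periodic extension)
        Ix = np.roll(I, -1, axis=1) - I
        Iy = np.roll(I, -1, axis=0) - I
        K = np.exp(-np.sqrt(Ix**2 + Iy**2) / b)   # K ~ 0 on strong edges
        # divergence of K * grad(I), via backward differences
        div = (K * Ix - np.roll(K * Ix, 1, axis=1)) \
            + (K * Iy - np.roll(K * Iy, 1, axis=0))
        I += dt * div
    return I
```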
Let g be the noisy image. We will consider additive noise. So, g can be written as:
g = f + n,
where f is the clean image and n is the noise.
Step 2: For n ≥ 0 and for all (x, y), x = 1, · · · , M, y = 1, · · · , N ,
(similar for f n )
Step 3: Continue the process until ||f n+1 −f n || ≤ tolerance. (Convergence depends on the spectral
radius of a matrix)
In fact, the solution of (***) is a minimizer of an energy (taking into account the extension by reflection outside the domain):

E_discrete(f) = Σ_{x=1}^{M} Σ_{y=1}^{N} (f(x, y) − g(x, y))² + Σ_{x=1}^{M} Σ_{y=1}^{N} [(f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²]

Setting the partial derivatives to zero,

0 = ∂E_discrete/∂f(x, y)
  = 2(f(x, y) − g(x, y)) + 2(f(x + 1, y) − f(x, y))(−1) + 2(f(x, y + 1) − f(x, y))(−1)
    + 2(f(x, y) − f(x − 1, y)) + 2(f(x, y) − f(x, y − 1))
After rearrangement,
f(x, y) = g(x, y) + Δf(x, y),
where Δf(x, y) = f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4f(x, y) is the discrete Laplacian.
Remark:
Writing the discrete Laplacian as a convolution Δf = l ∗ f, the equation can be solved with the DFT (with c the constant from the convolution theorem):

DFT(f) = DFT(g + Δf) = DFT(g + l ∗ f)
⇔ DFT(f)(u, v) = DFT(g)(u, v) + c · DFT(l)(u, v) · DFT(f)(u, v)
⇔ DFT(f)(u, v) = DFT(g)(u, v) / (1 − c · DFT(l)(u, v))
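A minimal numpy sketch of this DFT solution; with numpy's unnormalized FFT the convolution-theorem constant is c = 1, and we store the Laplacian kernel l with its center at index (0, 0):

```python
import numpy as np

def smooth_dft(g):
    """Solve f = g + Laplacian(f) (periodic boundary) via the DFT."""
    M, N = g.shape
    # discrete Laplacian kernel l, centered at index (0, 0) with wrap-around
    l = np.zeros((M, N))
    l[0, 0] = -4.0
    l[1, 0] = l[-1, 0] = l[0, 1] = l[0, -1] = 1.0
    # DFT(f) = DFT(g) / (1 - DFT(l));  1 - DFT(l) >= 1, so no division by zero
    f_hat = np.fft.fft2(g) / (1.0 - np.fft.fft2(l))
    return np.real(np.fft.ifft2(f_hat))

g = np.random.rand(64, 64)      # noisy test image
f = smooth_dft(g)
```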
1.5 Image denoising by solving PDE
We find f that minimizes
E(f) = ∫_Ω [(f(x, y) − g(x, y))² + |∇f(x, y)|²] dx dy.
Take any function v(x, y), and consider s(ε) = E(f + εv). If f is the minimizer, then
(d/dε) s(ε) |_{ε=0} = 0.
∴ s′(0) = 0 = 2 ∫_Ω (f(x, y) − g(x, y)) v(x, y) dx dy + 2 ∫_Ω (f_x v_x + f_y v_y) dx dy
Overall, integrating by parts (with ~n the outward unit normal on ∂Ω), we get

∫_Ω (f − g − Δf) v dx dy + ∫_∂Ω (∇f · ~n) v ds = 0

for all test functions v, from which we obtain:

f − g − Δf = 0   in Ω
∇f · ~n = 0      on ∂Ω        (PDE)
Conversely, given f such that the above PDE is satisfied, for any other h:

E(h) − E(f) = ∫_Ω [(h − g)² − (f − g)² + |∇h|² − |∇f|²] dx dy
            = ∫_Ω [((h − g) − (f − g))² + |∇h − ∇f|² + 2∇f · (∇h − ∇f) + 2(f − g)(h − f)] dx dy
            ≥ ∫_Ω [2∇f · ∇(h − f) + 2(f − g)(h − f)] dx dy
            = 2 ∫_Ω [−(∇ · ∇f)(h − f) + (f − g)(h − f)] dx dy + 2 ∫_∂Ω (∇f · ~n)(h − f) ds
            = 2 ∫_Ω (f − g − Δf)(h − f) dx dy + 0 = 0,

where the boundary term vanishes because ∇f · ~n = 0 on ∂Ω, and the remaining integrand vanishes because f − g − Δf = 0 in Ω. Hence E(h) ≥ E(f) for every h, so f is indeed the minimizer.
2. Solving PDE
Intuitively, we want to minimize the "derivative" / "jump" |∇f(x, y)|. On edges, K(x, y) is small, so the contribution of K(x, y)|∇f(x, y)| on the edges is small, and thus |∇f(x, y)| is penalized less on the edges (and edges are preserved).
Remark. Again, a problem arises when |∇f |(x, y) = 0. We will take care of it later.
We will show that the PDE (∗∗∗∗) must be satisfied by a minimizer of the following energy functional:

J(f) = (1/2) ∫∫_Ω (f(x, y) − g(x, y))² dx dy + λ ∫∫_Ω |∇f|(x, y) dx dy

where Ω is the image domain.
To do this, we must have "∂J/∂f = 0". Let's discretize J(f). The discrete version of J(f) is

J(f) = (1/2) Σ_{x=1}^{N} Σ_{y=1}^{N} (f(x, y) − g(x, y))² + λ Σ_{x=1}^{N} Σ_{y=1}^{N} √((f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²)

(finite difference approximations of ∂f/∂x and ∂f/∂y with Δx = Δy = 1).
Therefore, setting ∂J/∂f(x, y) = 0 and simplifying, we get:
f(x, y) − g(x, y) = λ { (f(x + 1, y) − f(x, y)) / √((f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²)
                      − (f(x, y) − f(x − 1, y)) / √((f(x, y) − f(x − 1, y))² + (f(x − 1, y + 1) − f(x − 1, y))²) }
                  + λ { (f(x, y + 1) − f(x, y)) / √((f(x + 1, y) − f(x, y))² + (f(x, y + 1) − f(x, y))²)
                      − (f(x, y) − f(x, y − 1)) / √((f(x + 1, y − 1) − f(x, y − 1))² + (f(x, y) − f(x, y − 1))²) }

for all 1 ≤ x, y ≤ N.
(This is a discretization of f − g = λ∇ · (∇f/|∇f|).)
The above equation is a non-linear equation. (Difficult to solve!!) We apply the gradient descent
method to solve it iteratively.
General gradient descent method
We consider the general case of solving
min_f J(f),
where f can be a number, a vector, or a function. In our case, f is a discrete image function (x, y) ↦ f(x, y). Hence, we consider J to be a function defined on a discrete image, which depends on M × N variables (assuming the image is of size M × N).
Consider a time-dependent image f(x, y; t). Assume f(x, y; t) solves the ODE:

df(x, y; t)/dt = −∇J(f(x, y; t))        (gradient descent equation) (∗)

Then J(f(x, y; t)) is decreasing as t increases, since (d/dt) J(f(x, y; t)) = ∇J · (df/dt) = −‖∇J(f)‖² ≤ 0. We solve (∗), and as t → ∞, f(x, y; t) converges to the minimizer f̄(x, y).
We solve the gradient descent equation iteratively. Let f⁰(x, y) be the initial guess. (Here 1 ≤ x ≤ M, 1 ≤ y ≤ N, and f⁰ is extended by reflection.)
We solve (∗) in a discrete sense:

(f^{n+1} − f^n)/Δt = −∇J(f^n)        (∗∗∗∗∗)
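A minimal sketch of the general scheme (∗∗∗∗∗) in Python; the function names, step size, and the toy objective are our illustrative choices:

```python
import numpy as np

def gradient_descent(grad_J, f0, dt=0.05, n_iter=500):
    """Iterate f^{n+1} = f^n - dt * grad_J(f^n), i.e. scheme (*****)."""
    f = f0.copy()
    for _ in range(n_iter):
        f = f - dt * grad_J(f)
    return f

# toy example: J(f) = 0.5 * ||f - g||^2 has gradient grad_J(f) = f - g,
# so the iteration converges to the minimizer f = g
g = np.array([1.0, 2.0, 3.0])
print(gradient_descent(lambda f: f - g, np.zeros(3)))
```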
In our case, (∗∗∗∗∗) becomes:
Remark. It is the TV de-noising model with image blur incorporated. We proceed to solve the optimization problem by the steepest (gradient) descent method.
Observation: Let Hf(x, y) := h ∗ f(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(α, β) f(x − α, y − β) dα dβ. Then

⟨Hf, g⟩ = ⟨f, H∗g⟩
where H∗ is the adjoint of H. We want to compute H∗.
⟨Hf, g⟩ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (h ∗ f)(x, y) g(x, y) dx dy
        = ∫∫ ( ∫∫ h(α, β) f(x − α, y − β) dα dβ ) g(x, y) dx dy
        = ∫∫ ( ∫∫ f(x − α, y − β) g(x, y) dx dy ) h(α, β) dα dβ

Let X = x − α, Y = y − β; we obtain

⟨Hf, g⟩ = ∫∫ ( ∫∫ f(X, Y) g(X + α, Y + β) dX dY ) h(α, β) dα dβ
        = ∫∫ f(X, Y) ( ∫∫ h(α, β) g(X + α, Y + β) dα dβ ) dX dY

(all integrals are over (−∞, ∞)). Hence H∗g(x, y) = ∫∫ h(α, β) g(x + α, y + β) dα dβ = (h̃ ∗ g)(x, y), where h̃(x, y) := h(−x, −y).
(f^{n+1} − f^n)/dt = −h̃ ∗ (h ∗ f^n − g) + λ ∇ · ( (1/|∇f^n|) ∇f^n )

with dt the time step and a suitable parameter added to ∇ · ((1/|∇f^n|) ∇f^n) to avoid singularity.
Remark. 1. Here, we consider the continuous image to be defined on the whole domain ℝ².
2. In practice, if we are given an image defined on a compact domain Ω, we extend the image to the whole of ℝ² by setting it to zero outside Ω.
3. The TV deblurring-denoising model simultaneously deblurs and denoises the image, while preserving edges.
Further remark: In the discrete case, we may consider finding f(x, y) (0 ≤ x, y ≤ N − 1) which minimizes:

E(f) = Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} (h ∗ f(x, y) − g(x, y))² + α Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} |∇f(x, y)|
We can again derive the gradient descent iterative scheme to minimize E(f ).
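A minimal numpy sketch of that gradient descent scheme for the pure denoising case h = δ (so h̃ ∗ (h ∗ f − g) = f − g); the ε-regularization of |∇f|, the step size, and the parameter values are our illustrative choices, with constant factors absorbed into dt and λ:

```python
import numpy as np

def tv_denoise(g, lam=0.1, dt=0.1, eps=1e-3, n_iter=200):
    """Gradient descent for the discrete TV model with h = identity."""
    f = g.astype(float).copy()
    for _ in range(n_iter):
        # forward differences with periodic extension
        fx = np.roll(f, -1, axis=1) - f
        fy = np.roll(f, -1, axis=0) - f
        mag = np.sqrt(fx**2 + fy**2 + eps**2)     # regularized |grad f|
        px, py = fx / mag, fy / mag
        # divergence via backward differences (adjoint of forward differences)
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        f += dt * (-(f - g) + lam * div)          # descend the energy gradient
    return f
```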
3 Image sharpening in the spatial domain (Optional)
Image sharpening in the spatial domain follows exactly the same idea as sharpening in the frequency domain. Instead of working in the frequency domain, we work in the spatial domain directly.
Let f be an input image. To sharpen the image, we compute a smoother image f_smooth (by Gaussian filtering or mean filtering). Define the sharper image g as:

g = f + λ(f − f_smooth),   λ > 0.
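A minimal Python sketch of this sharpening; the unsharp-masking form g = f + λ(f − f_smooth), the Gaussian smoother, and the parameter values are our illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(f, sigma=2.0, lam=1.0):
    f_smooth = gaussian_filter(f, sigma=sigma, mode='wrap')  # smoother image
    return f + lam * (f - f_smooth)      # boost the detail f - f_smooth
```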
Exercises
The entries of all matrices below are indexed by {0, 1, · · · , M − 1} × {0, 1, · · · , N − 1} unless
otherwise specified.
i. Find a necessary condition for optimality for the functional;
ii. Verify whether the necessary condition is sufficient to guarantee optimality;
iii. Derive an iterative scheme to minimize the functional;
iv. Discretize the energy functional (with forward difference for first derivatives and central
difference for second derivatives);
v. Find a necessary condition for optimality for the discretized functional;
vi. Derive an iterative scheme to minimize the discretized functional.
(a) E₁(f) = ∫_Ω [(f − g)² + λ‖∇f‖²] dx dy, λ > 0;
(b) E₂(f) = ∫_Ω [(f − g)² + K‖∇f‖⁴] dx dy, K(x, y) > 0 non-constant;
(c) E₃(f) = ∫_Ω [(h ∗ f − g)² + K‖∇f‖] dx dy, K(x, y) > 0 non-constant.
Derive the anisotropic diffusion equation

∂I(x, y; σ)/∂σ = ∇ · (K(x, y) ∇I(x, y; σ))

from minimizing the following energy functional:

E(I) = ∫_Ω K(x, y) ‖∇I(x, y)‖² dx dy.