Digital Image Processing - Advanced Course
We can assume that an image has almost constant values with small variations over several neighboring pixels, followed by abrupt changes of luminance. For image filtering it is important that the image energy is concentrated in the smallest possible number of transformation coefficients. A similar property is desirable for image compression.
Hadamard transform
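Here we introduce the Hadamard transform for $N=2^n$ samples. Alternative forms of the Hadamard transform exist for other numbers of samples. The derivation begins with the matrix:
$$ T_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} $$
The Hadamard transform for N=2 is defined by the transformation matrix:
$$ H_2 = \frac{1}{\sqrt{2}} T_2 $$
Hadamard transforms of larger dimensions are defined recursively as:
$$ T_N = \begin{bmatrix} T_{N/2} & T_{N/2} \\ T_{N/2} & -T_{N/2} \end{bmatrix}, \quad \text{where } H_N = \frac{1}{\sqrt{N}} T_N $$
For N=8:
$$ T_8 = \left[\begin{array}{rrrrrrrr}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{array}\right]
\begin{array}{c} 0 \\ 7 \\ 3 \\ 4 \\ 1 \\ 6 \\ 2 \\ 5 \end{array} $$
The number at the end of each row is the number of sign changes in the row (from 1 to -1 and vice versa).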
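The recursion and the sign-change counts can be checked with a few lines of MATLAB. This is only a minimal sketch: the variable names T, H and signChanges are assumptions made for illustration, and T stands for the unnormalized matrix of 1 and -1 entries.

```matlab
% Build the unnormalized Hadamard matrix by the recursion T_N = [T T; T -T].
N = 8;
T = 1;
while size(T, 1) < N
    T = [T, T; T, -T];
end
H = T / sqrt(N);                          % normalized Hadamard transform matrix

% Count the sign changes along each row of T.
signChanges = sum(abs(diff(T, 1, 2)) > 0, 2)';
disp(signChanges)
```

With this normalization H*H' is the identity matrix, so the inverse transform uses the same matrix.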
Walsh ordering
Very often, instead of the Hadamard transform, we use a reordered form of it (this reordering is called Walsh ordering, and the corresponding transform is sometimes called the Walsh transform). The row of the Hadamard transform matrix with the smallest number of sign changes (from 1 to -1 and from -1 to 1) is put on top of the transform matrix, and the remaining rows follow in increasing order of sign changes. The Hadamard and Walsh transforms are equivalent, but the Walsh ordering is analogous to the sinusoidal transforms in that the frequency represented by the coefficients increases with the row index.
For N=4 the rows in Walsh ordering are:
$$ \left[\begin{array}{rrrr} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \end{array}\right] $$
Columns of the Walsh matrix are quite similar to the sinusoidal functions used in sinusoidal transforms (expansions).
Several alternative rectangular transforms are reviewed in the textbook. Here, due to its importance, we will mention only the Haar expansion.
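The Walsh ordering can be sketched in MATLAB by sorting the rows of the Hadamard matrix by their number of sign changes. The names T, changes, order and W are assumptions used only for this illustration.

```matlab
% Unnormalized Hadamard matrix for N = 8 (recursive construction).
N = 8;
T = 1;
while size(T, 1) < N
    T = [T, T; T, -T];
end

% Sort the rows by the number of sign changes to get the Walsh ordering.
changes = sum(abs(diff(T, 1, 2)) > 0, 2);
[~, order] = sort(changes);
W = T(order, :) / sqrt(N);                % normalized Walsh-ordered matrix
```

Since only the rows are permuted, W stays orthogonal and the coefficients of the two transforms differ only in their order.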
Haar transform
The definition of the Haar transform is not quite simple. Consider an integer k in the domain $0 \le k \le N-1$. For $k \ge 1$ it can be written as $k = 2^p + q - 1$, where the integers p and q are selected so that p is the largest integer satisfying $2^p \le k$. Values of k index the rows of the Haar transform matrix (the first row has index 0, the next one index 1, and the last one index N-1). The second index (corresponding to columns) is denoted i and is also defined within the range $0 \le i \le N-1$. Now we can define x as x=i/N.
There is a direct relationship between x and the column index i. Elements in rows of the Haar transform are denoted $h_k(x)$, where k is the row index described above. These vectors are defined as:
$$ h_0(x) = \frac{1}{\sqrt{N}} $$
$$ h_k(x) = \frac{1}{\sqrt{N}} \begin{cases} 2^{p/2}, & \dfrac{q-1}{2^p} \le x < \dfrac{q-1/2}{2^p} \\ -2^{p/2}, & \dfrac{q-1/2}{2^p} \le x < \dfrac{q}{2^p} \\ 0, & \text{elsewhere} \end{cases} $$
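The definition can be turned into a short MATLAB sketch that fills the Haar matrix row by row. The names Hr, pos and neg are assumptions for this illustration; the rule k = 2^p + q - 1 is used for rows k = 1, ..., N-1, while row k = 0 is the constant row.

```matlab
N = 8;
x = (0:N-1) / N;                          % x = i/N for column index i
Hr = zeros(N);
Hr(1, :) = 1 / sqrt(N);                   % h_0(x)
for k = 1:N-1
    p = floor(log2(k));                   % largest p with 2^p <= k
    q = k - 2^p + 1;                      % from k = 2^p + q - 1
    pos = (x >= (q - 1) / 2^p)   & (x < (q - 0.5) / 2^p);
    neg = (x >= (q - 0.5) / 2^p) & (x < q / 2^p);
    Hr(k + 1, :) = 2^(p / 2) * (pos - neg) / sqrt(N);
end
```

The rows are orthonormal, so the inverse Haar transform is the transpose Hr'.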
The Haar transformer for N=2 (denoted HT N=2) is built so that one branch picks up two adjacent samples and adds them (with a scaling factor), while the second branch subtracts the adjacent samples.
Assume that we have 4 samples and that we transform them using the HT for 2 samples. The transform operates on the first pair of samples and then on the second pair. The results of the operations on the pairs are delivered on the upper and lower lines.
To get the HT for N=4 samples, note that the lower line already produces 2 samples of the output, while the HT for N=2 still has to be applied to the upper line.
[Figure: HT for N=4, input x(n), n=0,1,2,3, built from two HT N=2 stages.]
[Figure: HT for N=8, input x(n), n=0,...,7, built from HT N=2 stages.]
Optimal transform
The Haar transform will be used to introduce a very important class of transforms called wavelets. The logic behind these transforms is slightly different from that of the transforms introduced so far. They are appropriate for digital image processing since the energy of images is concentrated at low frequencies while details are at high frequencies. Before we proceed with the description of the wavelet transform, we want to address the question: what would the ideal (optimal) transform be? Based on the facts introduced so far, we can conclude that the best transform is the one that concentrates the signal energy in the smallest number of transformation coefficients.
Assume that we have a transform with transformation matrix A. The transformation can be written as: X = A x. We introduce the important concept of eigenvalues and eigenvectors of the matrix A. Eigenvalues $\lambda_i$ are the solutions of the equation:
$$ |A - \lambda I| = 0 $$
Let the solutions of this equation be $\lambda_i$, i=1,...,N. These solutions are called eigenvalues.
Eigenvalues
For a given $\lambda_i$ we should determine the vector $q_i$ satisfying: $A q_i = \lambda_i q_i$. It is easy to show that under the given conditions (when $\lambda_i$ is a solution of the previously defined equation) there is an infinite number of vectors satisfying this equation. We can limit the number of solutions to one (up to sign) by normalizing $q_i$ so that its amplitude (the square root of the sum of squares of the vector elements) is equal to 1, $\|q_i\| = 1$.
Matrix of eigenvectors
The matrix of eigenvectors can be formed by putting the eigenvectors in the columns of a matrix. This matrix, denoted Q, has numerous interesting properties. For example, for a symmetric matrix A it is an orthogonal matrix: $Q^{-1} = Q^T$. The system of equations $A q_i = \lambda_i q_i$ can be written in matrix form as $AQ = Q\Lambda$, where $\Lambda$ is the diagonal matrix of eigenvalues.
Eigenvalues matrix
It holds: $Q^T A Q = \Lambda$ and $Q \Lambda Q^T = A$. Determine the auto-correlation matrix of the transform X:
$$ C_{XX} = E\{XX^T\} = E\{A x x^T A^T\} = A\,E\{x x^T\}\,A^T = A c_{xx} A^T $$
For self-exercise, review basic matrix operations.
The energy concentrated in the transform can be connected with the energy of the matrix $C_{XX}$, and this energy has the smallest loss when $C_{XX}$ is a diagonal matrix.
Ideal transform
It is easy to conclude that the ideal (diagonal) matrix $C_{XX}$ is produced when the rows of A are the eigenvectors of $c_{xx}$. When $c_{xx}$ is known, we can determine the ideal transform A as the matrix of its eigenvectors. It can be shown that when we keep just the M transformation coefficients corresponding to the largest eigenvalues, the mean squared error is equal to:
$$ MSE = \sum_{i=M+1}^{N} \lambda_i $$
where $\lambda_i$ are the eigenvalues corresponding to the truncated coefficients, i.e., the MSE is the sum of the smallest N-M eigenvalues.
It can be shown that this transform produces the smallest MSE when we take M transformation coefficients. It is called the Karhunen-Loeve (KL) transform and it represents the ideal transform that is the goal for compression and filtering applications. Why are we not using the KL transform? The answer is quite simple. It requires calculation of the auto-correlation matrix and of its eigenvalues and eigenvectors. Both of these operations are quite demanding.
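The diagonalization property can be illustrated numerically. The sketch below assumes a hypothetical auto-correlation matrix (first-order Markov model with correlation 0.9, chosen only for illustration); the names cxx, lambda, CXX and mseTruncated are likewise assumptions.

```matlab
% Hypothetical auto-correlation matrix of x: c(i,j) = rho^|i-j|.
N = 8; rho = 0.9;
[ii, jj] = meshgrid(1:N);
cxx = rho .^ abs(ii - jj);

[Q, L] = eig(cxx);                        % columns of Q are eigenvectors
[lambda, idx] = sort(diag(L), 'descend'); % largest eigenvalues first
A = Q(:, idx)';                           % KL matrix: eigenvectors in rows

CXX = A * cxx * A';                       % diagonal up to rounding error
M = 3;
mseTruncated = sum(lambda(M+1:end));      % error when only M coefficients are kept
```

Here CXX equals diag(lambda) up to rounding, and mseTruncated is the sum of the N-M smallest eigenvalues.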
Eigenvalues - Examples
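We will review the steps in calculating eigenvalues by hand and how this can be implemented in MATLAB. The following matrix is given:
$$ R = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 1 & 0 \\ 1 & 0 & -1 \end{bmatrix} $$
$$ \det(R - \lambda I) = \begin{vmatrix} 1-\lambda & 2 & -1 \\ 3 & 1-\lambda & 0 \\ 1 & 0 & -1-\lambda \end{vmatrix} = -\lambda^3 + \lambda^2 + 6\lambda + 6 = 0 $$
In MATLAB the characteristic polynomial is obtained with kar_jed=poly(R).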
There is no direct approach for solving polynomial equations of higher order other than numerical means. The MATLAB function eig(R), which produces the eigenvalues, is then quite useful. In our case the eigenvalues are: $\lambda_1 \approx 3.337$, $\lambda_{2,3} \approx -1.168 \pm j0.658$. Now determine one of the eigenvectors by hand, the one corresponding to the first eigenvalue $\lambda_1$. For this eigenvalue it holds:
$$ \begin{bmatrix} -2.337 & 2 & -1 \\ 3 & -2.337 & 0 \\ 1 & 0 & -4.337 \end{bmatrix} \begin{bmatrix} q_{11} \\ q_{21} \\ q_{31} \end{bmatrix} = 0 $$
$$ q_{31} = q_{11}/4.337 = 0.231\,q_{11}, \quad q_{21} = (3/2.337)\,q_{11} = 1.284\,q_{11} $$
It is easy to check that these expressions satisfy the first equation (accuracy is limited by a small rounding error), which is not independent of the other two: $-2.337 q_{11} + 2 q_{21} - q_{31} = 0$.
Then the corresponding eigenvector is $[q_{11}, 1.284 q_{11}, 0.231 q_{11}]$. This vector can be normalized to unit norm:
$$ q_{11}^2 + (1.284 q_{11})^2 + (0.231 q_{11})^2 = 2.702\,q_{11}^2 = 1, \quad q_{11} = 0.608 $$
Then the eigenvector is [0.608, 0.781, 0.140]. The MATLAB function eig performs all of this in a simple manner: [Q,L]=eig([1 2 -1;3 1 0;1 0 -1]), where Q is the matrix of eigenvectors and L is the diagonal matrix of eigenvalues.
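The hand calculation can be cross-checked numerically. This is a minimal sketch using only poly, roots and eig; the names c, lambda, i1 and q1 are assumptions introduced for the check.

```matlab
R = [1 2 -1; 3 1 0; 1 0 -1];

c = poly(R);                  % coefficients of det(lambda*I - R), about [1 -1 -6 -6]
lambda = roots(c);            % one real root and a complex-conjugate pair

[Q, L] = eig(R);              % eigenvectors in columns of Q, eigenvalues on diag(L)
[~, i1] = max(real(diag(L))); % pick the real eigenvalue near 3.337
q1 = real(Q(:, i1));
q1 = q1 * sign(q1(1));        % eigenvectors are defined up to sign; make q11 positive
disp(q1')
```

Note that poly returns the polynomial of det(lambda*I - R), which is the negative of the polynomial written above; the roots are the same.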
The KL transform can be calculated for a line of an image as:
clear
Im=imread('cameraman.tif');
A=double(Im(180,:));
A=A(113:142);
A=A-mean(A);
[KL,L]=eig(A'*A)
In numerous cases just one eigenvalue accounts for most of the image energy, and instead of the entire matrix of the KL transform it is enough to store just several transformation coefficients (eigenvalues). However, this is paid for with calculation complexity and with storing the eigenvectors of the transform (the KL transform is signal dependent).
Wavelet transform
Now we can consider the Haar transform realized for N=8:
[Figure: Haar transform for N=8 realized with cascaded HT N=2 blocks; input x(n), n=0,...,7; outputs y(0); y(1); y(2), y(3); y(4), y(5), y(6), y(7).]
We noted that the lower branches contain no important components, so we do not process these branches further. As a result the Haar transform matrix has a large number of zero coefficients.
[Figure: one HT N=2 stage drawn as filtering followed by downsampling: h(n), x(n)*h(n), x_h(2n).]
The filters have impulse responses
$$ h(0) = \sqrt{2}/2, \quad h(1) = \sqrt{2}/2, \qquad g(0) = \sqrt{2}/2, \quad g(1) = -\sqrt{2}/2 $$
and 0 for other n.
Haar wavelet
In upper branch we have samples:
n=[0,N/2-1)
For exercise, create a system that performs synthesis of the input signal from this decomposition. In practice, after wavelet decomposition we filter coefficients corrupted by noise, or compress the signal by removing coefficients that are unimportant from the point of view of visual quality. The output signal then differs from the input signal.
We are interested in the conditions that the filters h0(n), h1(n), g0(n) and g1(n) should satisfy so that the output signal is the same as the input signal. Obviously, the filters h0(n) and h1(n) together should cover all frequencies so that the signal can be reconstructed at the output.
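For the Haar case the requirement that the output equals the input can be verified directly. The sketch below is one possible realization, assuming a hypothetical test signal x and the scaling factor sqrt(2)/2 in both branches; the names a, d and xr are assumptions.

```matlab
x = [4 6 10 12 8 6 5 5];                  % hypothetical test signal, even length
s = sqrt(2) / 2;

% Analysis: sums and differences of adjacent pairs, kept at half the rate.
a = s * (x(1:2:end) + x(2:2:end));        % lowpass branch
d = s * (x(1:2:end) - x(2:2:end));        % highpass branch

% Synthesis: invert the 2x2 orthogonal butterfly for each pair.
xr = zeros(size(x));
xr(1:2:end) = s * (a + d);
xr(2:2:end) = s * (a - d);
```

If the coefficients a and d are left unchanged, xr reproduces x exactly; thresholding or discarding coefficients between the two stages is what makes the output differ from the input.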
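$$ X(z) = \sum_{n=-\infty}^{\infty} x(n) z^{-n} $$
Check!
$$ x_u(n) = \begin{cases} x(n/2), & n = 0, 2, 4, \ldots \\ 0, & \text{elsewhere} \end{cases} \qquad X_u(z) = X(z^2) $$
$$ \hat{X}(z) = \frac{1}{2}\left[G_0(z)H_0(z) + H_1(z)G_1(z)\right]X(z) + \frac{1}{2}\left[G_0(z)H_0(-z) + H_1(-z)G_1(z)\right]X(-z) $$
The second term (the one multiplying X(-z)) should be 0.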
Assuming that we know h0(n) and h1(n), the equation for determining g0(n) and g1(n) in the Z-domain is:
$$ \begin{bmatrix} G_0(z) \\ G_1(z) \end{bmatrix} = \frac{2}{\det(H_m(z))} \begin{bmatrix} H_1(-z) \\ -H_0(-z) \end{bmatrix}, \qquad H_m(z) = \begin{bmatrix} H_0(z) & H_0(-z) \\ H_1(z) & H_1(-z) \end{bmatrix} $$
Assume that $\det(H_m(z)) = \alpha z^{-(2k+1)}$ (delay only); for $\alpha=2$ we obtain:
$$ \sum_{k} g_0(k) h_0(n-k) + (-1)^n \sum_{k} g_0(k) h_0(n-k) = 2\delta(n) $$
This relationship represents the condition that should be satisfied by wavelet filters in time domain. In the next lecture we will conclude design of wavelet filters, some comments related to the 2D wavelets will be given and standard usage of wavelets explained.
For self-exercise
Write a program for realization of the Hadamard and Walsh transforms. Determine the inverses of the Walsh and Hadamard transforms. To which group do they belong? Repeat the same procedure for the Haar transform. Assume that we have a signal of dimensions 8x8 with non-zero coefficients x(1,2)=2, x(3,1)=1 and x(5,6)=1. Calculate the Hadamard, Haar and Walsh transforms of this signal. For NxM=8x8 determine the basis images for (p,q)=(1,2), (p,q)=(3,3) and (p,q)=(6,7). Project for self-exercise: application of the Hadamard transform in the realization of special-purpose Hamming codes and Reed-Muller codes. Write a program for the Hadamard and other rectangular transforms of a digital image. When the image dimensions are not a power of 2, realize the transforms with a zero-padded signal.
For self-exercise
Give a hardware realization of the inverse Haar transformers. For the filters used in the Haar transform (or Haar wavelet), determine the spectral response. Since these filters have only 2 non-zero samples of the impulse response, zero-pad them to obtain clear results. Details of the eigenvalue decomposition and the KL transform in signal processing can be found in the book: M. R. Stojić, M. S. Stanković, R. S. Stanković: Diskretne transformacije u primjeni, Nauka, 1993 (page 119). Mini-project: study the singular value decomposition and the cases when it can be used. Try to find proper references; Wikipedia can be used as a starting point for the search.
For self-exercise
Write a program that performs the wavelet Haar decomposition of a signal of a given length (a power of 2) a given number of times (for example 3 times), so that each stage of the decomposition is performed on the lowpass signal. Give a graphical presentation of the signal decomposition using the wavelet transform, where the decomposition is performed 3 times on the lowpass part. Give the corresponding synthesis stage. Assume that the lowpass branch of the wavelet contains an ideal filter with cut-off frequency equal to half of the maximal frequency allowed by the sampling theorem. Determine the filter for the highpass branch and the corresponding filters on the synthesis side. What is the main difference between this system and the system with the Haar wavelet?
For self-exercise
Prove the formulas for the Z-transform of the downsampling and upsampling circuits. The signal x(n) is passed through a downsampling and then an upsampling circuit. Determine the output of this system in the time and Z-domain. Suppose the Haar transform has been determined. Try to find methods for calculating the wavelet Haar coefficients from the Haar transform. The Haar transform and the wavelet with the Haar filter are defined with the same basic goal. Determine the difference between these two transforms. Determine the relationship between the transform coefficients of the Hadamard and the Walsh transform. Mini-project: consider several real images and their transforms (DFT, DCT, DHT and the introduced rectangular ones). For each transform determine how many coefficients contain 50%, 70%, 80%, 90%, 95%, 98%, 99%, 99.5% and 99.9% of the signal power. Which of the introduced transforms is the best in each considered case?
For self-exercise
Repeat the previous experiments when the image is divided into blocks of $2^n \times 2^n$ pixels. Consider the possibility of removing transformation coefficients that have small energy, as a function of n and of the transform used. Can you draw a consistent conclusion? Apply the wavelet Haar decomposition to various images and repeat the analysis from the previous two problems for the application of this transform in data compression. Mini-project: study wavelets starting from the continuous-time form. For this research start from the textbook.
$$ G_0(z)H_0(z) + G_0(-z)H_0(-z) = 2 $$
$$ G_0(z)H_1(z) + G_0(-z)H_1(-z) = 0 $$
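$$ \sum_{k} g_0(k) h_0(n-k) + (-1)^n \sum_{k} g_0(k) h_0(n-k) = 2\delta(n) $$
Inner product:
$$ \langle x, y \rangle = \sum_{k} x(k) y(k) $$
$$ \langle g_1(k), h_1(2n-k) \rangle = \delta(n) $$
$$ \langle g_0(k), h_1(2n-k) \rangle = 0 $$
$$ \langle g_1(k), h_0(2n-k) \rangle = 0 $$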
We can summarize these relationships as follows: the wavelet filters should satisfy the following relationship in the time domain:
$$ \langle h_i(2n-k), g_j(k) \rangle = \delta(i-j)\,\delta(n), \quad i, j \in \{0, 1\} $$
When the same type of wavelets is used in the analysis and synthesis stages, we have orthogonal wavelets. Next we will discuss the realization of the wavelet transform for 2D signals, and after that we will give some program realizations.
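The time-domain condition can be checked numerically for the Haar filters. The sketch below makes two assumptions that are not in the text: the synthesis filters are taken as time-reversed analysis filters, g_j(n) = h_j(-n), stored for n = -1, 0, and only the shift n = 0 is tested because the two-sample filters do not overlap for other shifts. The names h, g and G are illustrative.

```matlab
s = sqrt(2) / 2;
h = [s  s;                 % h0(n), n = 0, 1 (analysis lowpass)
     s -s];                % h1(n), n = 0, 1 (analysis highpass)
g = fliplr(h);             % assumed synthesis filters g_j(n) = h_j(-n), n = -1, 0

G = zeros(2);              % G(a, b) = <h_{a-1}(2n-k), g_{b-1}(k)> at n = 0
for a = 1:2
    for b = 1:2
        c = conv(h(a, :), g(b, :));   % samples at indices -1, 0, 1
        G(a, b) = c(2);               % index 0 is the only even index
    end
end
disp(G)
```

The result is the 2x2 identity matrix: the inner product is 1 for matching filter pairs and 0 for the cross terms.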
2D Wavelets
How do we perform wavelet decomposition of 2D signals? As in the generalization of all 1D transforms to 2D, we can perform the operation separately along the rows and columns of an image. Therefore, we perform the wavelet transform along columns (or rows), followed by the wavelet transform along rows (or columns).
2D Wavelet transforms
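[Figure: 2D wavelet analysis filter bank. The input x(n,m) is filtered along columns with h0(n) and h1(n), and the elements on even positions of each column are kept (downsampling by 2). Each result is then filtered along rows with h0(m) and h1(m) and downsampled by 2. The h0(n), h0(m) branch gives a(n,m).]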
2D Wavelet - Comments
a(n,m) is the low-pass image (sometimes called the coarse image). The notation a comes from the fact that it is a product of the analysis side of the wavelet transformer. In the next stages we can proceed with its analysis using the wavelet transform. The images dV(n,m), dH(n,m) and dD(n,m) are called details in the vertical, horizontal and diagonal directions, respectively. These images are commonly not subject to further decomposition. Note that each of these images has dimensions N/2 x N/2, so the total amount of data is not changed by this procedure. What is the advantage of the wavelet transform and how can it be realized?
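One decomposition level can be sketched with Haar filters in plain MATLAB, without any toolbox. The test image and all variable names are assumptions for illustration; which of the two mixed subbands is labelled horizontal and which vertical depends on the convention, so they are left unnamed here.

```matlab
X = magic(8);                             % hypothetical 8x8 test image
s = sqrt(2) / 2;

% Filter along columns and keep every second row.
lo = s * (X(1:2:end, :) + X(2:2:end, :));
hi = s * (X(1:2:end, :) - X(2:2:end, :));

% Filter along rows and keep every second column.
a  = s * (lo(:, 1:2:end) + lo(:, 2:2:end));   % coarse image a(n,m)
d1 = s * (lo(:, 1:2:end) - lo(:, 2:2:end));   % detail subband
d2 = s * (hi(:, 1:2:end) + hi(:, 2:2:end));   % detail subband
d3 = s * (hi(:, 1:2:end) - hi(:, 2:2:end));   % diagonal detail
```

Each of the four outputs is 4x4, so the total number of samples is unchanged, and with this scaling the total energy is preserved as well.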
Wavelet realization
Very often, instead of realizing the wavelet transform from its definition, scientists use well-developed software tools. We will use the MATLAB Wavelet Toolbox, but alternative tools and MATLAB toolboxes can be found on the Internet. The toolbox is quite comprehensive and complicated; we will not consider all its possibilities but concentrate on several features that illustrate our theory in a step-by-step manner. Note that the toolbox contains single-step functions able to perform very sophisticated operations. The user only has to learn how to use these functions, and knowledge about wavelets is not required. This is convenient, but it is a source of numerous mistakes in practice.
Wavelet Toolbox
The most important recent contributions to the wavelet transform were made by Donoho, and all MATLAB functions related to his research can be found on his web site. For brevity we present the wavelet transform in a step-by-step manner without giving all details. Only in several places will we describe useful shortcuts that can be used in the procedures. The first useful function has the format:
[LD,HD,LS,HS] = wfilters('haar')
The outputs of this function are the wavelet filters used in the decomposition (analysis) stage, LD and HD (lowpass and highpass filters), as well as those used in the synthesis stage, LS and HS. The argument of the function is the wavelet type. Here we use the Haar wavelet, but there are numerous alternative predefined wavelet functions.
Wavelet filters
Instead of the abrupt Haar wavelets we can use smoother functions such as the Daubechies class of wavelet functions, called dbn, where n is an integer that can be selected from n=1 to n=45 (db1 is in fact the Haar wavelet). Here db23 will be used. The analysis stage can be performed by convolving the signal with the corresponding wavelet filters:
clear
% Test signal
x=wnoise(1,10);
[LD,HD,LS,HS]=wfilters('db23');
% Filtering with corresponding filters
xl=conv2(x,LD,'same');
xh=conv2(x,HD,'same');
The parameter 'same' is introduced so that the obtained signals have the same length as the original signal (the first argument); the unrequired part of the signal is truncated. conv2 is used because conv for 1D convolution has no 'same' argument in older MATLAB versions.
Image reconstruction
Reconstructed Cameraman based on 17% of wavelet coefficients. Due to the described properties the wavelet transform has numerous applications. Some of them are given on the next slides.
Application in denoising
Assume that an image is corrupted in the channel or during the acquisition process (for example, medical images obtained using ultrasound devices can be very noisy). It can be proved that the noise in different wavelet coefficients is (almost) mutually independent and that it is uniformly distributed over all wavelet coefficients (for a relatively large number of decomposition steps). We can therefore filter each coefficient independently. Two groups of techniques have been developed using wavelets: hard thresholding (all coefficients below the threshold are set to zero and the remaining ones are used in reconstruction; the threshold is determined based on the estimated noise variance); soft thresholding (small wavelet coefficients are attenuated proportionally more than large ones). Wavelet-based results are the state of the art in denoising applications. It is quite difficult to beat wavelets in this area in terms of imaging quality and speed.
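The two thresholding rules can be written in one line each. The coefficient vector c and the threshold T below are hypothetical values chosen for illustration, and the soft rule shown (shrinking every surviving coefficient by T) is one common form, assumed here rather than taken from the slides.

```matlab
c = [-3 -0.4 0.2 1.5 0.05 -2];            % hypothetical wavelet coefficients
T = 0.5;                                  % hypothetical threshold

hardT = c .* (abs(c) > T);                % hard: zero small ones, keep the rest as-is
softT = sign(c) .* max(abs(c) - T, 0);    % soft: zero small ones, shrink the rest by T
```

With soft thresholding a coefficient of magnitude 1.5 loses a third of its value while one of magnitude 3 loses only a sixth, which is the sense in which small coefficients are attenuated proportionally more.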
Wavelet toolbox
There are additional wavelet applications. Check the wavelet toolbox and the demo program wavedemo. The toolbox contains single-step functions for signal filtering. For example, check the function wden and its parameters. Note that this function is used for 1D denoising; see also the functions for denoising 2D images.
Drawbacks of wavelets
When the signal to be transformed lies dominantly at low frequencies, with only details and edges at high frequencies, the wavelet transform is close to an ideal tool for analysis, synthesis, filtering and compression. Digital images are commonly signals of this type, and today the wavelet transform is quite commonly used in digital image processing. However, when a relatively large amount of energy is concentrated at high frequencies, or when the signal has abrupt changes of frequency, wavelet transforms are not ideal tools.
Such signals are not common in digital images but they exist. One example is optical interferograms, commonly caused by mistakes in the design of optical communication channels or by diffraction in optical microscopy (optical or laser microscopes are cheaper than their electronic counterparts but suffer from several drawbacks, including interferograms). Interferogram: test example. The wavelet transform is not suitable for analysis and filtering of such signals.
Optical interferograms
TF (S/SF) Transforms
Time-frequency transforms are designed for analysis of 1D signals with high frequency content. The most important TFRs are:
Short-time Fourier transform (STFT); Wigner distribution (WD).
For multidimensional signals, multidimensional generalizations of these transforms are used (sometimes called space/spatial-frequency representations). Several representations from this group are described in the textbook; for brevity we skip them here.
Image filtering
The most important part of our course is related to image filtering and reconstruction. By filtering we usually mean extracting some features of a signal (for example, in communications, the information carrier modulated on some frequency). Filtering of digital images is commonly understood as a denoising operation. The goal of denoising is to obtain a signal as close as possible to the original image. In addition, we will briefly consider reconstruction of images corrupted by noise and distorted by some deterministic form of distortion.
The lowpass filtered image is almost unusable, while its highpass filtered counterpart is dark but recognizable (to be honest, for better visualization we increased the energy of the highpass image).
This is the reason why filtering of digital images is quite a complicated task, and one of the reasons for introducing the wavelet transform.
Filtered image:
$$ f(n,m) = \frac{1}{(2\pi)^2} \iint X(\omega_1,\omega_2)\, H(\omega_1,\omega_2)\, e^{j(\omega_1 n + \omega_2 m)}\, d\omega_1\, d\omega_2 $$
Filtering can be performed for denoising purposes, for removing distortions and/or for extracting important features from digital images.
Lowpass filter.
High-frequency filter.
For exercise, determine a stopband filter and filters with circle-shaped impulse responses. In addition, determine filters with the shape of the Hanning window function in the passband of the frequency response.
2D filter impulse response. Why do we call this the 2D filter impulse response and how is it connected with $H(\omega_1,\omega_2)$?
2D convolution.
Convolution or FFT
There are two alternatives for determining the output of linear space-invariant systems: convolution in the space domain or the FFT in the frequency domain. Which one to use?
The simpler technique.
At first glance that is convolution in the space domain, but in-depth analysis shows that this is not the case in general. Let the image x(n,m) have dimensions NxM and let the impulse response h(n,m) have dimensions N1xM1. The convolution result has dimensions (N+N1-1)x(M+M1-1). Why?
Under the assumption that the impulse response is smaller than the image, filtering using convolution requires (N+N1-1)x(M+M1-1)xM1xN1 additions and the same number of multiplications. Here (N+N1-1)x(M+M1-1) is the number of pixels at the filter output and M1xN1 is the number of operations for a single pixel.
The FFT-based realization requires 3 FFTs (for the filter response, for the input signal, and the inverse one for the output signal). Each of them requires (N+N1-1)x(M+M1-1)log2[(N+N1-1)x(M+M1-1)] operations.
All signals are properly zero-padded in order to avoid aliasing.
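The equivalence of the two routes, including the zero-padding, can be checked on a small example. The 6x6 test image and the 3x3 impulse response below are hypothetical, and the names yConv, yFft and P are assumptions.

```matlab
x = magic(6);                             % hypothetical 6x6 "image"
h = [1 2 1; 2 4 2; 1 2 1] / 16;           % hypothetical 3x3 impulse response

yConv = conv2(x, h);                      % direct route, full 8x8 result

P = size(x) + size(h) - 1;                % zero-padded size (N+N1-1) x (M+M1-1)
yFft = real(ifft2(fft2(x, P(1), P(2)) .* fft2(h, P(1), P(2))));

maxDiff = max(abs(yConv(:) - yFft(:)));   % difference is at rounding-error level
```

Padding both signals to the full output size is what prevents the circular convolution implied by the FFT from wrapping around.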
In the FFT-based realization we need an additional (N+N1-1)x(M+M1-1) multiplications $X(\omega_1,\omega_2)H(\omega_1,\omega_2)$. For the realization using convolution to be simpler, it must hold that:
$$ (N+N_1-1)(M+M_1-1)\,M_1 N_1 < 3(N+N_1-1)(M+M_1-1)\log_2[(N+N_1-1)(M+M_1-1)] + (N+N_1-1)(M+M_1-1) $$
After dividing by (N+N1-1)x(M+M1-1):
$$ M_1 N_1 < 3\log_2[(N+N_1-1)(M+M_1-1)] + 1 $$
Taking M1=N1, N=M and N1=αN we obtain:
$$ \alpha^2 N^2 < 6\log_2(N + \alpha N - 1) + 1 $$
Take for example N=100. This inequality holds for α<0.065, i.e., for impulse responses smaller than 7x7. We conclude that direct calculation of the convolution is better when the impulse response of the filter is relatively small. Note that all derived formulas are approximations, but they reflect the actual situation. In practice the FFT of the filter impulse response can be calculated in advance and stored in memory; we also assumed that the operations in direct evaluation of the convolution are performed on complex-valued functions.
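The break-even size can be found by evaluating both sides of the inequality for all candidate impulse response sizes. This sketch uses the approximate operation counts derived above for an N x N image and an N1 x N1 impulse response; the names direct, viaFft and N1max are assumptions.

```matlab
N = 100;                                  % image is N x N
N1 = 1:N;                                 % candidate impulse response sizes N1 x N1

direct = N1.^2;                           % operations per output pixel, direct convolution
viaFft = 6 * log2(N + N1 - 1) + 1;        % operations per output pixel, FFT route

N1max = max(N1(direct < viaFft));         % largest size for which direct is cheaper
disp(N1max)
```

For N=100 the largest size that still favors direct convolution is 6x6, in agreement with the bound of 7x7 stated above.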
Blur filters
The aim of blur filters is to blur images. The most important one is the Gaussian blur filter. Its primary aim is to give depth to images. Assume that we have an image of a person in front of some background. The filter can be applied to the background; the added depth then makes the background look very far from the person. The impulse response of the Gaussian blur filter is:
$$ h(n,m) = \frac{\exp\left(-(m^2+n^2)/\sigma^2\right)}{\sum\limits_{(n,m)\in D} \exp\left(-(m^2+n^2)/\sigma^2\right)} $$
The domain D is defined so that it is symmetric around the origin (0,0).
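A Gaussian blur kernel of this form can be generated as follows. The spread parameter (called sigma here) and the square domain D of half-width R are assumptions made for the sketch.

```matlab
sigma = 1.5;                              % hypothetical spread parameter
R = 3;                                    % hypothetical half-width: D = [-R,R] x [-R,R]

[m, n] = meshgrid(-R:R);                  % domain symmetric around the origin
h = exp(-(m.^2 + n.^2) / sigma^2);
h = h / sum(h(:));                        % normalize so the coefficients sum to 1
```

The normalization keeps the mean luminance of the image unchanged; the kernel can then be applied with conv2(image, h, 'same').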
Original image.
The simplest way to simulate the motion blur filtering along alternative directions is by rotating impulse response for a given angle and convolving image with rotated impulse response.
Laplace filter
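$$ h_\alpha(n,m) = \begin{bmatrix} \dfrac{\alpha}{1+\alpha} & \dfrac{1-\alpha}{1+\alpha} & \dfrac{\alpha}{1+\alpha} \\ \dfrac{1-\alpha}{1+\alpha} & \dfrac{-4}{1+\alpha} & \dfrac{1-\alpha}{1+\alpha} \\ \dfrac{\alpha}{1+\alpha} & \dfrac{1-\alpha}{1+\alpha} & \dfrac{\alpha}{1+\alpha} \end{bmatrix} $$
Impulse response of the Laplace filter. Defined for $\alpha \in [0,1]$. For $\alpha=0$ the impulse response becomes:
$$ h_1(n,m) = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix} $$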
This filter is used in edge detection and for detection of abrupt changes in luminance.
Image sharpening
The most important filter for sharpening digital images is the unsharp mask. The impulse response of this filter for dimensions 3x3 is defined using the Laplace filter as:
$$ h(n,m) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} - h_\alpha(n,m) $$
We should be very careful with this operation, since a large amount of sharpening can produce so-called grain noise.
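Both masks can be built and applied in a few lines. The value of alpha and the test image with a single vertical edge are hypothetical; the names hLap, hSharp, X and Y are assumptions.

```matlab
alpha = 0.2;                              % hypothetical value from [0, 1]
hLap = [alpha,     1 - alpha, alpha;
        1 - alpha, -4,        1 - alpha;
        alpha,     1 - alpha, alpha] / (1 + alpha);   % Laplace filter

delta = [0 0 0; 0 1 0; 0 0 0];
hSharp = delta - hLap;                    % unsharp mask

X = repmat([10 10 10 50 50 50], 6, 1);    % hypothetical image with a vertical edge
Y = conv2(X, hSharp, 'same');             % sharpened image
```

The Laplace coefficients sum to 0 and the unsharp mask coefficients sum to 1, so flat regions are unchanged while the edge gets an undershoot on the dark side and an overshoot on the bright side.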
Convolution filters
Convolution filters (we will learn some additional filters from this group) are the basis for numerous algorithms in image processing. At least 50% of the filters used in practice and in commercial image processing tools are based on them. They are also an important part of highly specialized image processing tools. In addition, there are some other filters, which will be taught next week, that do not belong to this group.
For self-exercise
Study of wavelet toolbox including:
Single-step filters; Decomposition and decomposition tree; Compression algorithms etc. Write report on 5 pages.
Write the missing commands required for creating the images on slides 14-16. Study soft and hard thresholding techniques for the wavelet transform and write programs for filtering images with an arbitrary number of decomposition levels. Find and study data about the JPEG 2000 algorithm. Gaussian noise with variance $\sigma^2$ is the input to the wavelet filter. Determine the threshold that minimizes the amount of this noise at the filter output.
For self-exercise
Write a report about time-frequency representations and their applications in image processing. Solve the problems related to time-frequency representations from the textbook. Determine the impulse response for convolution filters given in the spectral domain and the spectral response for filters given in the time domain. Give a MATLAB realization of convolution filters. Study the Filters menu in Photoshop. Which filters from this menu can be realized using convolution? Try to write your own versions of these filters. Study the options related to masks (selections) in Photoshop. Describe what mask filtering represents. A user wants to apply a filter to a part of an image. How can this be done? Which problems arise at the borders of this image part and how can we overcome them?
Signal Denoising
Probably the most important application of digital filters is denoising (removal of additive noise from signals). The denoising problem can be described as follows (here for 1D signals):
The signal of interest is f(n). It is corrupted by white additive noise $\nu(n)$: $x(n)=f(n)+\nu(n)$. Our goal is to obtain a filtered signal s(n) that is as close as possible to the original signal f(n), based on the noisy observations x(n).
As a similarity measure between f(n) and s(n) we use the mean squared error:
$$ MSE = E\{[s(n) - f(n)]^2\} = \frac{1}{N} \sum_{n=1}^{N} [s(n) - f(n)]^2 $$
where N is the total signal duration. Under certain assumptions, such as a known probability density function of the noise, we can construct filters that minimize the mean squared error. They are called ML (maximum likelihood) filters.
ML filter
The minimization problem defining ML filters is:
$$ s(n) = \arg\min_{\mu} \sum_{k=n-N}^{n+N} F(x(k) - \mu) $$
In this case 2N+1 is the width of the symmetric local neighborhood around the instant n for which we want to determine the filter output. General rule: a wider window (larger N) removes noise better but disturbs the signal more. arg min means: the value of $\mu$ that minimizes the sum. $F(\nu)$ is called the loss function (the term comes from economics) and for the ML filter it is equal to $F(\nu) = -\log p(\nu)$, where $p(\nu)$ is the probability density function of the noise. For Gaussian noise:
$$ F(\nu) = \log\left(\sqrt{2\pi}\,\sigma\right) + \frac{|\nu|^2}{2\sigma^2} $$
MA filter
The output of the filter for Gaussian input noise follows from:
$$ J(\mu) = \sum_{k=n-N}^{n+N} \sum_{l=m-M}^{m+M} [x(k,l) - \mu]^2 $$
The $\mu$ that minimizes this expression is obtained by setting the first derivative of $J(\mu)$ with respect to $\mu$ to zero:
$$ \frac{\partial J(\mu)}{\partial \mu} = -2 \sum_{k=n-N}^{n+N} \sum_{l=m-M}^{m+M} [x(k,l) - \mu] = 0 $$
$$ \sum_{k=n-N}^{n+N} \sum_{l=m-M}^{m+M} x(k,l) = \sum_{k=n-N}^{n+N} \sum_{l=m-M}^{m+M} \mu = (2N+1)(2M+1)\,\mu $$
$$ s(n,m) = \mu = \frac{1}{(2N+1)(2M+1)} \sum_{k=n-N}^{n+N} \sum_{l=m-M}^{m+M} x(k,l) = \frac{1}{(2N+1)(2M+1)} \sum_{k=-N}^{N} \sum_{l=-M}^{M} x(n+k, m+l) $$
Note that this filter, called the MA (moving average) filter, is a convolution filter that can be realized by convolving the image with an impulse response. What is this impulse response?
Determine the frequency response of MA filters. Are they lowpass or highpass filters?
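A moving average over a (2N+1)x(2M+1) window can be realized with conv2 and a constant kernel. The sketch below assumes a 3x3 window and a hypothetical 5x5 test image with a single bright pixel; the names h, X and S are assumptions.

```matlab
N = 1; M = 1;                             % 3x3 window
h = ones(2*N + 1, 2*M + 1) / ((2*N + 1) * (2*M + 1));

X = ones(5); X(3, 3) = 10;                % hypothetical image with one bright pixel
S = conv2(X, h, 'same');                  % MA filter output, same size as X
```

In the inner 3x3 region every output pixel equals 18/9 = 2, so the single bright pixel raises the level of all its neighbors. Near the border conv2 assumes zeros outside the image, so those values are lower.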
Drawbacks of MA filter
We already noted that the MA filter has a simple, semi-intuitive form and can be realized in a simple manner, but that it can disturb edges and details of images. However, images are subject not only to Gaussian noise; noise with rare but very strong amplitudes is also quite common. This type of noise is called impulsive noise. The MA filter is very sensitive to it: a single pixel with impulse noise is spread by the MA filter over the neighboring pixels, with decreased amplitude.
Assume that this is an image that we want to filter with a 3x3 MA filter. Red denotes an impulse that corrupted the image; its value differs significantly from those of the adjacent pixels. Filter the inner 3x3 pixels. The luminance of several neighboring pixels is significantly increased by averaging with the impulsive (corrupted) pixel. The MA filter in fact enlarges the zone of impulse influence.
Comparison of pdfs
The probability density function $p(\nu)$ of Gaussian noise is a smooth bell-shaped curve with fast decay to zero, meaning that there is a very low probability that the underlying process takes extremely large values. The probability density function of Laplace noise is a typical representative of the impulse noise class, since it has a long tail with a small but finite probability of taking very large values.
The probability density function of Laplace noise is proportional to $\exp(-|\nu|)$, with the corresponding loss function $F(\nu)=|\nu|$.
MEDIAN FILTER
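The median filter is the ML filter for a signal corrupted by Laplace noise. The derivation begins with:
$$ J(\mu) = \sum_{k=n-N}^{n+N} \sum_{l=m-M}^{m+M} |x(k,l) - \mu| $$
Setting the derivative with respect to $\mu$ to zero gives:
$$ \sum_{k=n-N}^{n+N} \sum_{l=m-M}^{m+M} \mathrm{sign}(x(k,l) - \mu) = 0 $$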
The sum of signs is equal to zero when the number of positive signs is the same as the number of negative signs. This can be understood in an alternative manner. Sort x(k,l) for $k \in [n-N, n+N]$ and $l \in [m-M, m+M]$ into nondecreasing order. Denote the sorted values as $x_{(i)}$, where: $x_{(1)} \le x_{(2)} \le \ldots \le x_{(i-1)} \le x_{(i)} \le \ldots \le x_{((2N+1)(2M+1))}$. The median is the value in the middle of the sorted sequence, $x_{(((2N+1)(2M+1)+1)/2)}$. For this value of $\mu$, half of the remaining samples give $x(k,l)-\mu$ less than zero (sign function equal to -1) while the other half are greater than 0 (sign +1).
Assume that we have image and we want to filter it with median filter of dimension 3x3. Red represents impulse that is significantly of different value than neighboring pixels. Perform filtering of inner 3x3 square. Obviously, impulse influence is removed from the pixels of filtered image.
For filtering of this pixel we have a sequence with 5x1, 3x2 and 1x3.
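The two slides on impulse behavior can be reproduced with a small experiment. The following is a minimal sketch in Python (standard library only), not the course's implementation; the 5x5 test image, the function names `filter3x3` and `mean`, and the choice to copy border pixels unchanged are assumptions made for illustration.

```python
from statistics import median

def filter3x3(img, reducer):
    """Apply reducer to every 3x3 neighborhood; border pixels are copied unchanged."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for n in range(1, rows - 1):
        for m in range(1, cols - 1):
            window = [img[k][l]
                      for k in range(n - 1, n + 2)
                      for l in range(m - 1, m + 2)]
            out[n][m] = reducer(window)
    return out

def mean(window):
    """Moving average (MA) of the window."""
    return sum(window) / len(window)

# Hypothetical test image: constant luminance 10 with one impulse of value 255.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 255

ma_out = filter3x3(image, mean)     # impulse is spread over the inner 3x3 zone
med_out = filter3x3(image, median)  # impulse is removed
```

In this example every inner pixel of the MA output is raised to (8·10+255)/9, while the median output returns the constant value 10 everywhere.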
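$$\mathrm{MSE} = E\{|x(n,m) - f(n,m)|^2\} = \frac{1}{(2N+1)(2M+1)} \cdot \frac{\int (F'(\xi))^2 p(\xi)\,d\xi}{\left[\int F''(\xi) p(\xi)\,d\xi\right]^2}$$
Here x(n,m) is labeled as the noisy image and f(n,m) as the non-noisy (original) image.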
For exercise, perform the following experiments: check the mean squared error for Gaussian noise with the median and MA filters; then repeat this analysis for Laplace noise and finally for Cauchy noise. Cauchy noise is a model of impulse noise with probability density function p(ξ) ∝ a/(ξ²+a²). What conclusions can be derived from this analysis? Suggested conclusions: the median is slightly worse than the MA filter for Gaussian noise, slightly better for Laplace noise, and much better for Cauchy noise. The median filter performs relatively well for impulse noise.
The drawbacks of the median filter are its complexity and the fact that very small details in images are treated as impulses and eliminated. In order to keep small details, the size of the median filter is kept small, commonly not larger than 9x9 and very often just 3x3.
y=uint8(y);
Filter comparison
Image corrupted by Gaussian noise and filtered with the moving average filter and the median filter: it is difficult to notice a significant difference in image quality. Image corrupted by Laplace noise and filtered with the median filter and the moving average filter: the median filter produces significantly better quality.
Test noises
Noise for testing filters can be created in various manners. For example, Gaussian noise can be obtained as s*randn(N,M), where NxM is the image size and s is the standard deviation of the noise. An image corrupted by noise should be returned to the image format before visualization: non-integer parts are truncated (or rounded) and values are limited to the range between 0 and 255 for an image represented with 8 bits per pixel. An alternative function is: B=imnoise(A,tip,parametri); A is the original non-noisy image, tip is the type of noise, for example 'salt & pepper' for typical impulse noise, while parametri are the noise parameters (for 'salt & pepper' it is the noise density, i.e., the fraction of corrupted pixels).
Generation of noise
The textbook gives details about Laplacian noise generation based on the uniform noise generator (available in MATLAB as the function rand). Recent MATLAB versions provide numerous noise generators. For example, poissrnd generates Poisson random noise with mean value and variance equal to the luminance of the non-noisy image. Ultrasound images are subject to this kind of noise. Other random noise generators can be found through the MATLAB help system by searching for the phrase Random number generators. For testing different algorithms and techniques we need to generate different noises; sometimes the MATLAB functions are not available on the machines where the technique should be implemented (for example medical devices), so we have to realize them from scratch.
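The noise generation steps above can be sketched from scratch, using only a uniform and a Gaussian generator. This is an illustrative Python sketch, not the textbook's procedure: the Laplace sample is produced here as an exponentially distributed magnitude (inverse-CDF method applied to a uniform sample) with a random sign, and the names `laplace_sample`, `to_uint8` and `add_noise`, the scale 10 and the 4x4 test image are all assumptions.

```python
import math
import random

def laplace_sample(scale, rng):
    """Zero-mean Laplace sample built from two uniform samples."""
    magnitude = -scale * math.log(1.0 - rng.random())  # log argument lies in (0, 1]
    return magnitude if rng.random() < 0.5 else -magnitude

def to_uint8(value):
    """Round and limit a noisy luminance to the 8-bit range [0, 255]."""
    return max(0, min(255, int(round(value))))

def add_noise(img, sampler):
    """Add one noise sample to every pixel and return to the image format."""
    return [[to_uint8(p + sampler()) for p in row] for row in img]

rng = random.Random(0)  # fixed seed so that the experiment can be repeated
image = [[128] * 4 for _ in range(4)]
gauss_noisy = add_noise(image, lambda: rng.gauss(0.0, 10.0))
laplace_noisy = add_noise(image, lambda: laplace_sample(10.0, rng))
```

The clipping step mirrors the requirement that the noisy image be returned to the 8-bit format before visualization.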
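$$\mathrm{SNR} = 10\log_{10} \frac{\sum_{m=1}^{M}\sum_{n=1}^{N} f^2(n,m)}{\sum_{m=1}^{M}\sum_{n=1}^{N} [f(n,m) - \hat f(n,m)]^2}$$
The SNR is given in dB. Here f(n,m) is the original image and $\hat f(n,m)$ is the filtered image. The denominator corresponds to the mean squared filtering error (Mean Squared Error, MSE), which is also used as a quality measure.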
Quality measure
The most popular quality measure for image filtering is the peak signal-to-noise ratio (PSNR), defined as:
$$\mathrm{PSNR} = 10\log_{10} \frac{f_{\max}^2}{\frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N} [f(n,m) - \hat f(n,m)]^2}$$
where $f_{\max}$ is the maximal signal value (maximal luminance). Excluding some non-realistic situations (for example a dark image with a single bright pixel), the PSNR is a very good quality measure for filtering and other image processing algorithms.
Quality measure
For PSNR>60dB the difference between two images is very difficult to observe even in direct comparison. For PSNR>40dB (and mostly for PSNR>35dB) the difference can be observed only by comparing the original and the filtered image. For 15dB<PSNR<35dB the image can be recognized but it is of lower quality (very bad close to the lower bound, relatively good close to the upper bound). For PSNR<15dB (or PSNR<20dB) the image is not usable. These bounds are one of the reasons for the wide usage of this measure in practice.
The improvement in signal-to-noise ratio (ISNR), like all other measures where the logarithm is used, is given in decibels (dB). It represents the ratio between the MSE of the input (noisy) signal and the MSE of the output (filtered) signal, i.e., the improvement achieved with our filter. For ISNR=0dB filtering does not improve image quality.
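The quality measures of the last few slides can be collected in a few helper functions. This is a minimal Python sketch under stated assumptions: images are lists of rows, the function names are hypothetical, `f_max` defaults to 255 (8 bits per pixel), and the ISNR is taken as the ratio of the MSE before and after filtering, as described above. The functions fail with a division by zero when the two compared images are identical.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized images (lists of rows)."""
    total = sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return total / (len(a) * len(a[0]))

def snr_db(original, estimate):
    """Signal energy over error energy, in dB."""
    size = len(original) * len(original[0])
    energy = sum(p * p for row in original for p in row) / size
    return 10.0 * math.log10(energy / mse(original, estimate))

def psnr_db(original, estimate, f_max=255.0):
    """Squared maximal luminance over the MSE, in dB."""
    return 10.0 * math.log10(f_max ** 2 / mse(original, estimate))

def isnr_db(original, noisy, filtered):
    """MSE of the noisy image over MSE of the filtered image, in dB."""
    return 10.0 * math.log10(mse(original, noisy) / mse(original, filtered))

# Hypothetical 2x2 example: the filter halves the error in two pixels.
original = [[100, 100], [100, 100]]
noisy = [[110, 90], [100, 100]]
filtered = [[105, 95], [100, 100]]
improvement = isnr_db(original, noisy, filtered)
```

In the example the MSE drops from 50 to 12.5, so the improvement is 10·log10(4), about 6 dB. A negative value would mean that the filter made the image worse.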
L - filters
We can apply median filters since they produce relatively accurate results. Better results can be obtained using the ML filter for the particular noise type. However, the ML filters for the mixed noise case cannot be written in closed form and their realization requires iterative procedures. Instead of designing the ML filter we use an approximation called L-filters, which combines properties of the MA and median filters.
L-filters - definition
L-filter is an abbreviation for linear combination of order statistics. Assume that we have an image and that we consider pixel (n,m) with a symmetric local neighborhood of size (2N+1)x(2M+1). In the case of the median filter we sorted the pixels from the neighborhood into a non-decreasing sequence:
$$x_{(1)},\ x_{(2)},\ x_{(3)},\ \dots,\ x_{([(2N+1)(2M+1)+1]/2)},\ \dots,\ x_{((2N+1)(2M+1)-1)},\ x_{((2N+1)(2M+1))}$$
The values $x_{(i)}$ satisfy $x_{(i)} \le x_{(i+1)}$. The median is equal to $x_{([(2N+1)(2M+1)+1]/2)}$.
L-filters - Definition
The L-filter can be defined as:
$$y(n,m) = \sum_{i=1}^{(2N+1)(2M+1)} a_i\, x_{(i)}(n,m)$$
where $a_i$ are the L-filter coefficients. The coefficients $a_i$ commonly satisfy the following properties:
$$a_i = a_{(2N+1)(2M+1)+1-i}$$
(unbiasedness condition: values at the same distance from the median are taken with the same weights), and
$$\sum_{i=1}^{(2N+1)(2M+1)} a_i = 1$$
(condition that keeps the energy (luminance) of the output signal the same as in the input signal).
The parameter α of the α-trimmed mean takes values in [0,0.5].
For self-exercise
Create functions for the MA and median filters for a non-rectangular neighborhood. Create a function that realizes the α-trimmed mean with α as an input argument.
Hint. The realization is close to the median. We perform sorting (function sort in MATLAB), and after that average the central values in the sorted sequence (how many depends on α).
Realize the L-filter described in the lower part of slide 45. Realize the myriad filter for a local neighborhood of given size. The myriad filter is the ML filter for Cauchy noise, which has a probability density function proportional to s/(ξ²+K²), where K is the so-called linearization parameter.
Hint. The output of this filter cannot be represented in closed form and we need to develop an iterative algorithm. The initial iteration can be the output of the MA or the median filter.
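The hint for the α-trimmed mean (sort, then average the central values) can be sketched as follows. This is an illustrative Python sketch rather than a reference solution: the function name is hypothetical, and the convention that α is the fraction of samples discarded at each end of the sorted sequence is an assumption, since other conventions exist.

```python
def alpha_trimmed_mean(window, alpha):
    """Average the central part of the sorted window.

    alpha in [0, 0.5] is the fraction of samples discarded at EACH end
    (assumed convention). alpha = 0 gives the MA output and alpha = 0.5
    gives the median for an odd number of samples.
    """
    if not 0.0 <= alpha <= 0.5:
        raise ValueError("alpha must lie in [0, 0.5]")
    ordered = sorted(window)
    count = len(ordered)
    trim = min(int(alpha * count), (count - 1) // 2)  # keep at least one sample
    kept = ordered[trim:count - trim]
    return sum(kept) / len(kept)

# Hypothetical 3x3 neighborhood with one impulse (255).
window = [10, 12, 11, 10, 255, 11, 12, 10, 11]
as_ma = alpha_trimmed_mean(window, 0.0)
trimmed = alpha_trimmed_mean(window, 0.2)
as_median = alpha_trimmed_mean(window, 0.5)
```

With α=0.2 one sample is discarded at each end of the nine sorted values, which already removes the impulse from the average.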
For self-exercise
A real image is corrupted by Gaussian noise. Can the mean squared error of the output of the MA filter be larger than that of the median filter? Why? Repeat the same analysis for images corrupted by Laplacian noise and filtered with these two filters. Can the ML filter for this noise (the median) produce worse results than the MA filter, and why? Create separable and recursive median filters. Compare the separable median, median, α-trimmed mean and MA filters for filtering of salt & pepper noise, Gaussian noise and Laplacian noise.
Hint. The experiment can be performed as follows: consider several images, consider different noise levels, for each level repeat the simulations for various noise realizations (Monte-Carlo simulation), consider different filter parameters (size, α, etc.) and various noise parameters, and try to draw some conclusions.
For self-exercise
Determine the spectral characteristics of the median filter. Determine the spectral characteristics of the MA filters (MA filters in the frequency domain). Does the L-filter defined in the lower part of slide 45 follow the common requirements for L-filters presented on slide 42? Consider noise with a uniform probability density function on a given finite interval. Define the ML filter for this noise. Can we define a filter from the L-filter class to filter out this noise? Based on which two special filter types from the L-filter class can we filter out this noise? Consider noise that has, with probability (1-p), a Gaussian pdf with mean 0 and variance σ² (sometimes denoted as N(0,σ²), where N comes from the term normal pdf) and, with probability p (p is relatively small, commonly below 10%), a Gaussian pdf with mean 0 and variance K²σ², where K is greater than 1 and commonly is 3 or 5. Determine the corresponding ML filter for this noise. Can it be given in closed form? For this noise type, is the MA or the median filter better?
For self-exercise
Realize the ML filter for mixed Gaussian noise (you will probably need an iterative procedure). Compare it with the median and the MA filter. Are the obtained results in line with expectations, and why? The median filter takes as its output an arbitrary pixel from the neighborhood. We want to keep the original pixel if it is not corrupted by impulse noise. There is a modification of the median filter called the weighted median, where the central pixel is repeated several times in the sequence used for sorting in order to increase the probability that it is the output of the filter. Realize this filter and compare the results obtained with various numbers of repetitions. Rayleigh noise is equal to the square root of the sum of squares of two Gaussian noises. It is common in practice. Can the median, MA and other introduced filter forms be effectively used for filtering of this noise? What is the ML filter for this noise? How should the L-filter or the median filter be modified in order to obtain acceptable results in this noise environment?
For self-exercise
Create functions for the calculation of PSNR, SNR, and other quality measures (based on the original and the noisy or filtered image). Create a function for ISNR evaluation. Can the result obtained with the ISNR be negative, and what conclusion can we draw from such a result?
Wiener filter
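Assume that the filtering goal is minimization of the mean squared error between the original image x(n,m) and the output of a linear filter h(k,l) applied to the observed image y(n,m):
$$\mathrm{MSE} = \frac{1}{NM}\sum_{n=1}^{N}\sum_{m=1}^{M}\Big[x(n,m) - \sum_{k}\sum_{l} h(k,l)\,y(n-k,m-l)\Big]^2$$
Easy! We should calculate the partial derivatives of the MSE with respect to h(k,l) and solve the system of equations obtained by setting these derivatives to zero. I will skip several steps in this procedure (try it yourselves).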
Wiener filter
The obtained set of equations can be written as:
$$\sum_{k}\sum_{l} h(n-k,m-l)\,R_{yy}(k,l) = R_{xy}(n,m)$$
$$R_{yy}(k,l) = E[y(n,m)\,y(n-k,m-l)]$$
$$R_{xy}(k,l) = E[x(n,m)\,y(n-k,m-l)]$$
A convolution is obtained on the left-hand side of this equation. Then we can apply the 2D DFT to both sides of the equation.
Wiener filter
The calculation of 2D DFT gives:
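A sketch of this step, written out as a worked equation. The symbols $S_{yy}$ and $S_{xy}$ (2D DFTs of $R_{yy}$ and $R_{xy}$) are notation assumed here, and the second line additionally assumes that the observation is the original image plus additive noise that is uncorrelated with the image:

```latex
% 2D DFT of the convolution equation (convolution becomes a product):
H(\omega_1,\omega_2)\, S_{yy}(\omega_1,\omega_2) = S_{xy}(\omega_1,\omega_2)
\quad\Rightarrow\quad
H(\omega_1,\omega_2) = \frac{S_{xy}(\omega_1,\omega_2)}{S_{yy}(\omega_1,\omega_2)}

% Assumption: y = x + noise, noise uncorrelated with x, so that
% S_{xy} \propto |X|^2 and S_{yy} \propto |X|^2 + |N|^2:
H(\omega_1,\omega_2) =
\frac{|X(\omega_1,\omega_2)|^2}{|X(\omega_1,\omega_2)|^2 + |N(\omega_1,\omega_2)|^2}
```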
Wiener filter
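The considered filter can be written as:
$$H(\omega_1,\omega_2) = \frac{|X(\omega_1,\omega_2)|^2}{|X(\omega_1,\omega_2)|^2 + |N(\omega_1,\omega_2)|^2} = \frac{|Y(\omega_1,\omega_2)|^2 - |N(\omega_1,\omega_2)|^2}{|Y(\omega_1,\omega_2)|^2} = 1 - \frac{|N(\omega_1,\omega_2)|^2}{|Y(\omega_1,\omega_2)|^2}$$
This filter is called the Wiener filter. In the last expression $|N(\omega_1,\omega_2)|^2$ is unknown, while $|Y(\omega_1,\omega_2)|^2$ is known.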
To determine the filter response function in the spectral domain we need information about the spectral power of the noise, i.e., we need to know the parameters of the noise. A simplified form replaces the response with a threshold T:
$$H(\omega_1,\omega_2) = \begin{cases} 1, & |Y(\omega_1,\omega_2)|^2 > T \\ 0, & |Y(\omega_1,\omega_2)|^2 \le T \end{cases}$$
This simplified form very often produces quite accurate results, sometimes better than those produced with more sophisticated methods.
Wiener filters are implemented in various equipment since the type of disturbance that is common for a given machine can be known in advance. For example, it is known that in ultrasound imaging we are dealing with Poisson noise, and the parameters of this noise can be known in advance. Wiener filters are popular since they are linear and can be efficiently realized in real time.
Inverse Filtering
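The Wiener filter in this case is:
$$H_w(\omega_1,\omega_2) = 1 - \frac{|N(\omega_1,\omega_2)|^2}{|Y(\omega_1,\omega_2)|^2}, \qquad H_w(\omega_1,\omega_2)\,Y(\omega_1,\omega_2) \approx X(\omega_1,\omega_2)\,D(\omega_1,\omega_2)$$
where ≈ denotes approximately.
This filter is an estimator of X(ω1,ω2)D(ω1,ω2). When we know D(ω1,ω2), then X(ω1,ω2) (the 2D FT of the original image) can be calculated with the inverse filter: for |D(k1,k2)| > ε the estimate is $H_w(k_1,k_2)Y(k_1,k_2)/D(k_1,k_2)$, while the frequencies with |D(k1,k2)| ≤ ε are treated separately, since division by values close to zero would amplify the noise. Here ε is an adopted parameter.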
Definition of median
We consider instant n and the local neighborhood [n-K,n+K]. The output of the median filter is then calculated as the value μ that minimizes:
$$J(\mu) = \sum_{k=n-K}^{n+K} |x(k) - \mu|$$
This can be understood as the value from the set {x(k)|k∈[n-K,n+K]} with the minimal sum of distances to all other points from the set.
Vector median
A color pixel can be written as a vector x(n,m)=(r(n,m),g(n,m),b(n,m)). The output of the vector median for a local neighborhood (here we give a 2D local neighborhood) is the element μ of the set {x(k,l)|(k,l)∈[n-K,n+K]x[m-L,m+L]} that has the smallest sum of distances to all other points from the set:
$$J(\boldsymbol{\mu}) = \sum_{k=n-K}^{n+K} \sum_{l=m-L}^{m+L} d(\mathbf{x}(k,l), \boldsymbol{\mu})$$
where d is a distance in 3D space.
Distance in 3D space
For the distance function defined as (squared distance):
$$d(\mathbf{x}_1(n,k), \mathbf{x}_2(n,k)) = (r_1(n,k) - r_2(n,k))^2 + (g_1(n,k) - g_2(n,k))^2 + (b_1(n,k) - b_2(n,k))^2$$
the output is the MA filter applied to the channels of the image. For the Euclidean distance:
$$d(\mathbf{x}_1(n,k), \mathbf{x}_2(n,k)) = \sqrt{(r_1(n,k) - r_2(n,k))^2 + (g_1(n,k) - g_2(n,k))^2 + (b_1(n,k) - b_2(n,k))^2}$$
we obtain the vector median filter output.
For self-learning, realize the vector median filter, consider its numerical complexity and compare it with the marginal median filter.
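As a starting point for this exercise, the following Python sketch compares the two filters on one neighborhood. It is a sketch under stated assumptions, not a reference solution: pixels are (r, g, b) tuples, the function names are hypothetical, the Euclidean distance is used, and the marginal median assumes an odd number of pixels.

```python
import math

def vector_median(pixels):
    """Pixel from the set with the smallest sum of Euclidean distances to the others."""
    def cost(candidate):
        return sum(math.dist(candidate, other) for other in pixels)
    return min(pixels, key=cost)

def marginal_median(pixels):
    """Median computed separately for each color channel (odd number of pixels)."""
    middle = len(pixels) // 2
    return tuple(sorted(channel)[middle] for channel in zip(*pixels))

# Hypothetical neighborhood 1: one impulse among nearly constant pixels.
window = [(10, 10, 10)] * 8 + [(255, 0, 0)]
vm = vector_median(window)

# Hypothetical neighborhood 2: three pure colors.
colors = [(0, 0, 255), (0, 255, 0), (255, 0, 0)]
mm = marginal_median(colors)   # a color that is not present in the set
vm2 = vector_median(colors)    # always one of the input pixels
```

The vector median evaluates a distance for every pair of pixels, so its cost grows with the square of the neighborhood size, while the marginal median only needs one sort per channel. The second example shows the usual argument for the vector form: the marginal median can produce a color (here black) that does not occur in the neighborhood.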
Vector L-filters
The Euclidean distance is just one possible method for evaluating color differences, so alternative vector median forms can be developed. How can vector L-filters be realized for color images? The technique is relatively simple. The pixels from the local neighborhood are sorted according to their sum of distances to the other pixels (colors). The output of the L-filter is the average of several pixels having the smallest sums. For self-exercise, realize this filter. There are numerous variants of adaptive vector filters.
Tool
We created a Tool for non-linear image filtering (for grayscale and color images) that realizes the introduced filters and numerous others. Students can download version 1.02 of this tool. A major revision is under way and it should provide more comfortable work and increased computational efficiency.
Pseudocolors
Determination of pseudocolors is a procedure where a grayscale image (an image without color information) is replaced with a color version. The first idea that comes to mind is obtaining color images from old photos, but pseudocoloring commonly refers to a different operation. It is performed when we want an image in color that is more informative than the grayscale one, i.e., an image based on the grayscale one in which some details can be observed more easily (here we use the property of human vision that it is more sensitive to colors than to luminance).
Pseudocolor applications
Pseudocolors are used in surveillance systems (X-ray systems at airports) where security officers try to find explosives, drugs and other smuggled goods. The most important set of pseudocolor applications is in medicine. Searching for small blood vessels in tissue that produces similar luminance, ultrasound imaging of the kidney and color ultrasound systems are just some of the applications in medicine.
Pseudocolor methods
Here, we will explain three techniques for obtaining pseudocolors. The first technique is the simplest: an RGB color is assigned to each shade of gray. This is similar to the color map described earlier, except that here the grayscale is represented with the corresponding color map. Companies that produce equipment study different color maps and try to find those that produce the best results. In fact, when buying such a product you are buying the colormap. A similar technique is to adopt several levels of luminance and to represent the values between two levels with one color, and the values between other levels with another color.
Spectral domain
Assume the original image f(n,m); then the red, green and blue channels can be obtained as:
$$C_R(n,m) = f(n,m) *_n *_m h_L(n,m)$$
$$C_G(n,m) = f(n,m) *_n *_m h_B(n,m)$$
$$C_B(n,m) = f(n,m) *_n *_m h_H(n,m)$$
where $h_H(n,m)$, $h_B(n,m)$ and $h_L(n,m)$ are the impulse responses of highpass, bandpass and lowpass filters. These filters can be defined in the spectral domain as:
$$H_H(\omega_1,\omega_2) = \begin{cases} 0, & D(\omega_1,\omega_2) \le D_H \\ 1, & D(\omega_1,\omega_2) > D_H \end{cases} \qquad H_L(\omega_1,\omega_2) = \begin{cases} 1, & D(\omega_1,\omega_2) \le D_L \\ 0, & D(\omega_1,\omega_2) > D_L \end{cases}$$
What are we getting when buying equipment with pseudocolors in the spectral domain? Companies sell us the parameters $D_L$ and $D_H$, modifications of the function D(ω1,ω2), as well as filters other than the cut-off filters described here. Finally, the most important are the weights associated with the corresponding channels. The research related to pseudocolors is packaged together with sensors and other devices and sold as a machine.
Example
Pseudocolors are a very important issue in security systems. Here an image of a suitcase obtained using X-rays is given together with its pseudocolor version.
The red rectangle in the right image represents TNT explosive. This material is very difficult to identify in the grayscale image. Since this pseudocolor image is dominated by red and blue, we can conclude that some spectral-domain technique was used for imaging.
Halftoning - Dithering
Halftoning and dithering are strategies for printing continuous-scale images on printers that have a single color. Here we use the drawback of the human eye that it perceives several dots at a small distance as a shade. When we want to print black we put more dots, when we want to print gray we put fewer dots, and when a color close to white is printed a very small number of dots is used.
Halftoning
Halftoning is the simpler strategy for obtaining discrete tones from continuous ones in order to produce an image suitable for printing. The idea behind this strategy is quite simple. One pixel of the original image is represented by x black-or-white spots on the printer (the standard for text in Europe is 8x8 or 10x10, while in Japan and some other Asian countries it is 12x12 in order to allow reproduction of kanji, i.e., Chinese characters). Assume that the luminance can be within the limits [Amin,Amax].
Halftoning
The range of possible luminance can be divided into x+1 regions in the following manner: the i-th region is [Amin+(i-1)ΔA, Amin+iΔA), where ΔA=(Amax-Amin)/(x+1) and i∈[1,x+1]. The first region (i=1) represents the darkest part of the image and we assign x black dots to it. Each further region has a smaller number of dots than the previous one: the i-th region has x-i+1 black dots. We can illustrate this on an example with a zone of 2x2 pixels and luminance within [0,255].
Halftoning - Example
Luminance [0,51): 4 black
Luminance [51,102): 3 black
Luminance [102,153): 2 black
Luminance [153,204): 1 black
Luminance [204,255]: without black
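The region rule above can be checked with a few lines of Python. This is an illustrative sketch: the function name `black_dots` and its parameter names are assumptions, `cells` plays the role of x (4 for a 2x2 zone), and the sketch only computes how many dots are black, not where they are placed in the zone.

```python
def black_dots(luminance, cells=4, a_min=0.0, a_max=255.0):
    """Number of black dots printed for one input pixel.

    cells is the number x of printer spots per pixel (4 for a 2x2 zone).
    """
    step = (a_max - a_min) / (cells + 1)
    region = int((luminance - a_min) // step) + 1   # 1-based region index i
    region = max(1, min(region, cells + 1))         # a_max falls into the last region
    return cells - region + 1

# Luminance values on both sides of each region boundary of the 2x2 example.
dots = [black_dots(v) for v in (0, 50, 51, 101, 102, 152, 153, 203, 204, 255)]
```

For the 2x2 example the step is 255/5 = 51, so the computed counts follow the table: 4, 3, 2, 1 and 0 black dots from the darkest to the brightest region.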
Halftoning is simple procedure but it has a significant drawback that limits its applications. Note that technical details of this procedure can be represented in slightly different form.
Halftoning example
Assume that the image has uniform luminance over a larger zone and that this luminance corresponds to the case with two black pixels.
Here an enlarged image of this pattern is given. Humans cannot distinguish black and white dots at a small distance, but we are able to recognize periodic patterns even when they are quite small. The biggest problem are lines, and this image could look like:
Dithering
Dithering is a more complicated methodology for obtaining a binary image from a grayscale one that includes some additional (stochastic) elements. Halftoning can be regarded as an algorithm that truncates colors. In dithering we want to reduce and distribute the error produced in the process of obtaining a binary image from a continuous-scale (or close to continuous-scale) image. Here we use a dithering matrix that is commonly defined using recursive relationships.
Dithering matrix
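The dithering matrix has dimensions nxn and it can be obtained from the matrix of dimensions (n/2)x(n/2):
$$\mathbf{D}^{n} = \begin{bmatrix} 4\mathbf{D}^{n/2} + D^{2}_{00}\mathbf{U}^{n/2} & 4\mathbf{D}^{n/2} + D^{2}_{01}\mathbf{U}^{n/2} \\ 4\mathbf{D}^{n/2} + D^{2}_{10}\mathbf{U}^{n/2} & 4\mathbf{D}^{n/2} + D^{2}_{11}\mathbf{U}^{n/2} \end{bmatrix}$$
where $\mathbf{D}^{n/2}$ is the matrix of smaller dimensions, $\mathbf{U}^{n/2}$ is the matrix of dimensions (n/2)x(n/2) with all ones (ones in MATLAB), and $D^{2}_{ij}$ is the element at position (i,j) of the dithering matrix of dimension 2x2 (here the index begins with 0). The initial matrix is:
$$\mathbf{D}^{2} = \begin{bmatrix} 0 & 2 \\ 3 & 1 \end{bmatrix}$$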
Dithering matrix
The 4x4 dithering matrix is (check it):
$$\mathbf{D}^{4} = \begin{bmatrix} 0 & 8 & 2 & 10 \\ 12 & 4 & 14 & 6 \\ 3 & 11 & 1 & 9 \\ 15 & 7 & 13 & 5 \end{bmatrix}$$
The most common dithering matrix has dimension 8x8. Determine it for self-exercise. The obtained dithering matrix is periodically repeated in both directions until the entire image is covered.
$$g(k,l) = \begin{cases} 1, & f(k,l) > T(k,l) \\ 0, & \text{elsewhere} \end{cases}$$
where the threshold is: $T(k,l) = (A_{\max}/D_{\max})\, D^{n}(k \bmod n,\ l \bmod n)$
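The recursion and the thresholding rule can be sketched in Python as follows. This is a minimal sketch, not the course's implementation: the function names are hypothetical, and taking Dmax as the largest entry of the dithering matrix (n²-1) is an assumption, since the slides do not define it explicitly.

```python
def dither_matrix(n):
    """Build the n x n dithering matrix (n a power of two) from the 2x2 initial matrix."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    base = [[0, 2], [3, 1]]
    if n == 2:
        return base
    h = n // 2
    half = dither_matrix(h)
    # Block (i // h, j // h) equals 4 * half + base[i // h][j // h] * ones.
    return [[4 * half[i % h][j % h] + base[i // h][j // h] for j in range(n)]
            for i in range(n)]

def dither(img, n=4, a_max=255):
    """Binarize img with the periodically repeated threshold T(k, l)."""
    d = dither_matrix(n)
    d_max = n * n - 1  # assumed meaning of Dmax: largest matrix entry
    return [[1 if img[k][l] > (a_max / d_max) * d[k % n][l % n] else 0
             for l in range(len(img[0]))]
            for k in range(len(img))]

d4 = dither_matrix(4)
# Hypothetical test: uniform mid-gray 4x4 image.
binary = dither([[128] * 4 for _ in range(4)])
```

For the uniform mid-gray image, half of the sixteen output pixels are set to 1, arranged in a checkerboard pattern.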
original image
We performed the experiment with a small number of pixels of the original image in order to make the results more obvious. A careful look at the halftoned image can even reveal the shape of the binary font.
For self-exercise
Create the Wiener filter for an image corrupted by Gaussian noise. The filter should estimate the standard deviation of the noise using the approximate relationship:
$$\hat\sigma = \mathrm{median}\{|x(n) - x(n-1)|,\ n \in [2,N]\} / 0.6745$$
i.e., as a scaled median of the absolute differences between neighboring pixels. Note. This is a 1D relationship. It should be adjusted to the 2D case. Realize the inverse filter. Do you obtain the expected results? If not, try to determine the reasons. You can find more details on inverse filters and their practical application on the Internet or from the lecturer. Consider a methodology for determining the motion blur parameters for motion in an arbitrary direction. Try to include this information in the inverse filter for motion blur.
For self-exercise
Inverse filtering and Gaussian blur: how can the parameters of the blur be estimated? Consider methods for the realization and try to identify problems in the realization. Try to find additional references (textbooks, Internet and lecturer) with more details on how to overcome problems in the realization of this filter. Realize the marginal median, vector median and vector L-filter for color images. Consider the realization of a toolbox for non-linear filtering, especially the realization of vector filters for color images. Try to find methods for reducing the complexity of these algorithms. Realize the pseudocoloring technique described in the lecture. Describe the problems you found in the realization. Realize both techniques for halftoning (binary font and dithering).
For self-exercise
Find more information about the Floyd-Steinberg algorithm for dithering and try to realize it. Try to find more details related to color dithering and realize some of these techniques. Mini-project. One very important problem in practice is so-called inverse dithering. It is a technique that produces a continuous-scale image from a binary one. This is in fact filtering of a binary image that produces a non-binary output. Propose your own technique for solving this problem and try to find related work in available publications.
Importance of Edges
From the very beginning of the course we have heard that there is a simple rule for images: low frequencies carry a significant amount of image energy but they are relatively unimportant for visual image quality, for both human and machine vision; the high frequency region has small energy but it is a very useful part from the point of view of human and machine vision. This high frequency region can be very prone to the influence of noise. The most important part of the high frequency region are edges: objects and images are recognizable based on edges alone; edges can be recorded in binary format and processed efficiently; based on detected edges we can easily adjust the local neighborhood for filtering algorithms, etc.
First derivative
The first derivative is defined as:
$$f'(x) = \frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{f(x+\Delta x) - f(x)}{\Delta x}$$
For 2D functions the derivative is determined as:
$$\nabla f(x,y) = [f_x(x,y), f_y(x,y)] = \left[\frac{\partial f(x,y)}{\partial x}, \frac{\partial f(x,y)}{\partial y}\right]$$
An abrupt variation of luminance can be detected based on the magnitude (amplitude) of the derivative, which can be calculated from the partial derivatives along x and y as:
$$e(x,y) = \sqrt{f_x^2(x,y) + f_y^2(x,y)}$$
The first noticeable problem is the fact that the image is not continuous but discrete, so we need to use differences instead of derivatives:
$$f_x(n,m) = f(n+1,m) - f(n,m)$$
The second significant problem is the fact that differences amplify noise.
Assume that we have an image corrupted by Gaussian white noise ν(n,m) with variance σ² and that the noise is independent of pixel position (uncorrelated in neighboring pixels). Then the variance of the noise in the difference along the x-coordinate is:
$$E\{[\nu(n+1,m) - \nu(n,m)]^2\} = E\{\nu^2(n+1,m)\} - 2E\{\nu(n+1,m)\nu(n,m)\} + E\{\nu^2(n,m)\} = 2\sigma^2$$
The middle term is equal to 0 since the noise is uncorrelated, so the difference has increased (doubled) variance.
Roberts detector
This very simple detector can be evaluated as:
$$e(n,m) = \sqrt{[f(n,m) - f(n+1,m+1)]^2 + [f(n+1,m) - f(n,m+1)]^2}$$
Four pixels are used for the evaluation of the edge detector, but without information about edge direction. The design of this detector used knowledge about the human visual system and its ability to detect edges: this operation is assumed to be performed by the eyes in the process of edge detection. In practice, however, the results achieved with this detector are poor.
Mask
The most common group of edge detectors is based on masks. A mask is a 3x3 matrix, and the edge detector response is evaluated using convolution with this matrix. Two matrices are designed, for edge detection along the x- and y-axis. The rules for matrix selection follow.
The matrix used for detection along the x-axis should be the transpose, or the version rotated by π/2, of the matrix used for detection along the y-axis. In this manner the same importance is given to both orthogonal directions. Here we will consider detection along the x-axis with the matrix:
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
Mask design
The central pixel is the pixel of interest. We want to subtract the pixels at positions a12 and a32. However, we also use the neighboring pixels to reduce the influence of noise. The right and left pixels should be taken with the same strength, and from this it follows that a11=a13, a21=a23 and a31=a33:
$$\begin{bmatrix} a_{11} & a_{12} & a_{11} \\ a_{21} & a_{22} & a_{21} \\ a_{31} & a_{32} & a_{31} \end{bmatrix}$$
Pixels in the bottom row are subtracted from pixels in the top row, so these coefficients should have opposite signs:
$$\begin{bmatrix} a_{11} & a_{12} & a_{11} \\ a_{21} & a_{22} & a_{21} \\ -a_{11} & -a_{12} & -a_{11} \end{bmatrix}$$
Mask design
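For a uniform image we want the edge detector to produce a result equal to 0. Then the sum of the mask coefficients must be 0, which gives a22=-2a21:
$$\begin{bmatrix} a_{11} & a_{12} & a_{11} \\ a_{21} & -2a_{21} & a_{21} \\ -a_{11} & -a_{12} & -a_{11} \end{bmatrix}$$
Commonly, an edge detector used for detection along one direction should produce a result equal to 0 for an edge in the normal direction. Then we set the central position in the mask equal to 0, so the whole middle row vanishes:
$$\begin{bmatrix} a_{11} & a_{12} & a_{11} \\ 0 & 0 & 0 \\ -a_{11} & -a_{12} & -a_{11} \end{bmatrix}$$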
Mask design
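The mask can be divided by a11 since it is an unimportant multiplicative constant. With K=a12/a11, the masks for the two orthogonal directions are:
$$\begin{bmatrix} 1 & K & 1 \\ 0 & 0 & 0 \\ -1 & -K & -1 \end{bmatrix} \qquad \begin{bmatrix} 1 & 0 & -1 \\ K & 0 & -K \\ 1 & 0 & -1 \end{bmatrix}$$
K=1 gives the Prewitt matrix and K=2 gives the Sobel matrix, the most important edge detector based on masks.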
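The mask family derived above can be tried out with a short Python sketch. It is illustrative only: the function names are hypothetical, the masks are applied as a plain weighted sum of the neighborhood (correlation rather than flipped convolution, which only changes the sign of the response), only inner pixels are evaluated, and which mask is called the x or the y mask depends on the axis convention.

```python
import math

def gradient_masks(K):
    """Pair of 3x3 masks; K=1 gives Prewitt, K=2 gives Sobel."""
    rows = [[1, K, 1], [0, 0, 0], [-1, -K, -1]]       # top row minus bottom row
    cols = [list(column) for column in zip(*rows)]    # transpose: left minus right
    return rows, cols

def apply_mask(img, mask, n, m):
    """Weighted sum of the 3x3 neighborhood of inner pixel (n, m)."""
    return sum(mask[i][j] * img[n - 1 + i][m - 1 + j]
               for i in range(3) for j in range(3))

def edge_strength(img, K=2):
    """Magnitude of the two mask responses for all inner pixels."""
    mask_a, mask_b = gradient_masks(K)
    return [[math.hypot(apply_mask(img, mask_a, n, m),
                        apply_mask(img, mask_b, n, m))
             for m in range(1, len(img[0]) - 1)]
            for n in range(1, len(img) - 1)]

# Hypothetical test images: a vertical luminance step and a uniform image.
step = [[0, 0, 100, 100] for _ in range(4)]
uniform = [[70] * 4 for _ in range(4)]
sobel_step = edge_strength(step, K=2)
prewitt_step = edge_strength(step, K=1)
sobel_uniform = edge_strength(uniform, K=2)
```

On the step image the response is 100·(1+K+1) at the inner pixels next to the step, and on the uniform image it is zero, as required by the zero-sum condition.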
MATLAB Function
The MATLAB function for edge detection is edge. It is quite simple to use: E=edge(G,type,parameters); where G is a grayscale image, type is the detector type (all described so far and some others: 'roberts', 'prewitt' and 'sobel'), while parameters are the parameters of the considered detector. The function output E is a binary image. Commonly the third argument is the threshold, within the domain [0,1]; it can be omitted (MATLAB then takes a default value) or given as the empty matrix []. The fourth argument for the sobel and prewitt detectors is the direction of edges: 'horizontal', 'vertical' or 'both'.
Direction of edge
Obviously, determination of the detection threshold is quite a complex issue. The threshold setup can even differ between detectors. There is also the problem of estimating the direction of edges. For the mask-based detectors (Sobel, Prewitt, etc.) we have detector responses for the x and y coordinates. The edge direction can be estimated from ex(n,m) and ey(n,m) as:
$$\theta(x,y) = \arctan\frac{e_y(x,y)}{e_x(x,y)}$$
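Kirsch detectors
Kirsch designed additional masks for detection of edges in other directions. For example, a detector that is sensitive to edges in the directions of the x- and y-axes and of the axes at angles of π/4 and 3π/4 with respect to the main axes can be designed using 4 matrices:
$$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix} \quad \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} \quad \begin{bmatrix} 0 & 1 & 1 \\ -1 & 0 & 1 \\ -1 & -1 & 0 \end{bmatrix} \quad \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & -1 \\ 0 & -1 & -1 \end{bmatrix}$$
The direction of the edge is determined by the largest detector response produced with these matrices.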
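Kirsch detectors
Kirsch designed a detector with eight matrices corresponding to eight directions:
$$\begin{bmatrix} 5 & 5 & 5 \\ -3 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix} \quad \begin{bmatrix} -3 & 5 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & -3 \end{bmatrix} \quad \begin{bmatrix} -3 & -3 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & 5 \end{bmatrix} \quad \begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & 5 \\ -3 & 5 & 5 \end{bmatrix}$$
$$\begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & -3 \\ 5 & 5 & 5 \end{bmatrix} \quad \begin{bmatrix} -3 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & 5 & -3 \end{bmatrix} \quad \begin{bmatrix} 5 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & -3 & -3 \end{bmatrix} \quad \begin{bmatrix} 5 & 5 & -3 \\ 5 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix}$$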
Laplace detector
The logic behind the Laplace detector is quite different from that of the mask-based detectors. The discrete form of the Laplace differential operator is:
$$\nabla^2 f(n,m) \approx f(n,m) - \frac{1}{4}[f(n+1,m) + f(n-1,m) + f(n,m+1) + f(n,m-1)]$$
An edge (abrupt change in luminance) can be detected as a zero of the Laplace operator (a position with zero-crossing).
If red is equal to 100 and green to 50, the Laplace detector response for pixel X is 100-350/4>0, while for pixel o it is 50-250/4<0, suggesting that the zero-crossing (edge) is between these two pixels.
[Figure: neighboring pixels o and X on the two sides of the edge.]
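The zero-crossing example can be verified numerically. The following Python sketch assumes a hypothetical image layout in which the two left columns have luminance 100 and the two right columns have luminance 50, which reproduces the neighbor sums 350 and 250 used above; the function name is also an assumption.

```python
def laplace_response(img, n, m):
    """Pixel value minus the average of its four nearest neighbors."""
    neighbors = img[n + 1][m] + img[n - 1][m] + img[n][m + 1] + img[n][m - 1]
    return img[n][m] - neighbors / 4.0

# Assumed layout: luminance 100 on the left side, 50 on the right side.
image = [[100, 100, 50, 50] for _ in range(3)]
at_x = laplace_response(image, 1, 1)   # pixel on the 100 side of the edge
at_o = laplace_response(image, 1, 2)   # pixel on the 50 side of the edge
zero_crossing = at_x * at_o < 0        # sign change between the two pixels
```

The two responses have equal magnitude and opposite signs, so the sign change locates the edge between the two pixels.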
The Laplace detector introduces a threshold on the variance of the local neighborhood:
$$\sigma^2(n,m) = \frac{1}{(2M+1)^2}\sum_{k=n-M}^{n+M}\sum_{l=m-M}^{m+M} [f(k,l) - \bar f(n,m)]^2, \qquad \bar f(n,m) = \frac{1}{(2M+1)^2}\sum_{k=n-M}^{n+M}\sum_{l=m-M}^{m+M} f(k,l)$$
For small variance we assume that no edge is detected (the image is almost constant), while for large variance together with a zero-crossing we assume that an edge is detected. Alternatively, the region where edges can be detected can be determined from the difference:
$$w(n,m) = \max_A\{f(n,m)\} - \min_A\{f(n,m)\}$$
i.e., from the maximal and minimal luminance in the local neighborhood A.
Laplace-Gauss detector
A technique for reducing the influence of noise in the Laplace detector is filtering of the image, or of the detector response, with a Gaussian filter (Gaussian blur). Both operations are linear, so their order can be exchanged without affecting the result. The obtained detector is called LoG (Laplacian of Gaussian) in the literature. The Gaussian matrix has the shape:
$$h(n,m) = \exp(-(n^2 + m^2)/(2\sigma^2))$$
LoG detector
The most common matrix for the LoG detector is:
$$\begin{bmatrix} 0 & 0 & -1 & 0 & 0 \\ 0 & -1 & -2 & -1 & 0 \\ -1 & -2 & 16 & -2 & -1 \\ 0 & -1 & -2 & -1 & 0 \\ 0 & 0 & -1 & 0 & 0 \end{bmatrix}$$
Different values of σ give different LoG matrices, but the given matrix is the most common in practice.
The Canny detector is a special form of the LoG detector and is considered to be one of the best (but not the simplest).
Canny detector
Three different filters are employed in the realization of the Canny detector. The first is the Gaussian with impulse response:
$$h(n,m) = \frac{\exp(-(n^2 + m^2)/(2\sigma^2))}{2\pi\sigma^2}$$
where σ is an adopted constant. The other two are filters with impulse responses equal to the partial derivatives of the Gaussian filter impulse response:
$$h_x(n,m) = \frac{\partial h(n,m)}{\partial n} = -\frac{n \exp(-(n^2 + m^2)/(2\sigma^2))}{2\pi\sigma^4}$$
$$h_y(n,m) = \frac{\partial h(n,m)}{\partial m} = -\frac{m \exp(-(n^2 + m^2)/(2\sigma^2))}{2\pi\sigma^4}$$
Canny detector
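The image is first filtered with the Gaussian filter, giving g(n,m). After this operation, the output of the Gaussian filter is filtered with the filters whose responses are equal to the derivatives of the Gaussian function:
$$e_x(n,m) = g(n,m) *_n *_m h_x(n,m), \qquad e_y(n,m) = g(n,m) *_n *_m h_y(n,m)$$
Commonly, instead of the detector response e(n,m) obtained from ex(n,m) and ey(n,m), we present the binary image obtained by comparing it with a threshold.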
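When the edge is not an abrupt change in luminance but a single line passing over a background of a different color, we can use matrices of the form:
$$\begin{bmatrix} -1 & -1 & -1 \\ 2 & 2 & 2 \\ -1 & -1 & -1 \end{bmatrix} \quad \begin{bmatrix} -1 & -1 & 2 \\ -1 & 2 & -1 \\ 2 & -1 & -1 \end{bmatrix} \quad \begin{bmatrix} -1 & 2 & -1 \\ -1 & 2 & -1 \\ -1 & 2 & -1 \end{bmatrix} \quad \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}$$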
Hough transform
In some applications images are transformed to binary form using edge detectors and afterwards used for recognition of geometric primitives. The simplest primitive is the straight line. It can be represented using the linear function y=ax+b (or m=an+b). The main goal is to determine whether pixels belong to a line and to determine the parameters of the line (a,b). The procedure is not quite simple. Consider a point (x1,y1) that belongs to an edge: we do not know whether this edge is a straight line and, if it is, we do not know its parameters.
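With the line described by the parameters r and θ, the parameter domains are R ≥ r ≥ 0 and π/2 > θ ≥ -π/2.
The limit R of the domain can be determined from the image dimensions.
[Figure: line in the (x, y) plane with the parameter r marked.]
In this manner both parameters used for line parameterization have a limited domain.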
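The voting idea behind the Hough transform with a bounded parameter domain can be sketched as follows. This Python sketch rests on assumptions that are not taken from the slides: the line is written in the normal form r = x·cos(θ) + y·sin(θ), θ is sampled on [0, π) with a signed r rather than on the domain given above, r is rounded to the nearest integer, and the function name `hough_peak` is hypothetical.

```python
import math
from collections import Counter

def hough_peak(points, n_theta=180):
    """Return (votes, theta, r) of the strongest line r = x*cos(theta) + y*sin(theta).

    theta is sampled on [0, pi) and r is rounded to the nearest integer;
    both choices are assumptions of this sketch.
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            r = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(t, r)] += 1       # every edge point votes once per angle
    (t, r), count = votes.most_common(1)[0]
    return count, math.pi * t / n_theta, r

# Hypothetical edge points lying on the horizontal line y = 3.
edge_points = [(x, 3) for x in range(10)]
count, theta, r = hough_peak(edge_points)
```

All ten points vote for the same cell near θ = π/2 and r = 3. Because of the coarse quantization of r, several neighboring angles collect the same number of votes, so the reported angle is only accurate to a few degrees; a finer r grid or a longer line sharpens the peak.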
Results
Circle parameters
In the case of circle parameterization we can use the edge direction (angle). Namely, we can employ an edge detection algorithm that produces the edge e(x,y) and the edge direction (angle) θ(x,y). For a circle the edge direction is normal to the radius, so the search for the circle center can be reduced to a 1D search along the line normal to the edge direction. Unfortunately, the edge direction can be determined only with limited accuracy and we cannot employ this idea in a straightforward manner. Instead we use a segment of the circle in the search procedure.
For self-exercise
Realize all introduced edge detectors: Roberts, Sobel, Prewitt, Laplace, Gauss-Laplace and Canny. Realize the described methods for mapping the edge detector response to a binary image. Study the methodology for determining the threshold for various edge detectors (especially for the Canny detector) in the MATLAB function edge. For mini-project: Consider edge detectors for color images using the Canny and Cumani operators. In this case the detectors are not evaluated separately for each channel but on combinations of channels. For mini-project: Application of the edge detector in the realization of adaptive filters.
For self-exercise
For mini-project: Application of color edge detectors in the creation of adaptive color filters. For mini-project: Realize a detector of isolated points and a detector of contrast lines, with methods for threshold selection. For mini-project: Realization of the Kirsch detector and the possibility of determining edge direction more precisely than as a multiple of 45 degrees. For mini-project: Advanced adaptive techniques for detection of edges using precise estimation of derivatives. Create a faster version of the algorithm for the Hough transform. Hint: For a detected edge point we can go along the detected direction instead of over all (x,y).
For self-exercise
Establish the relationship between the Hough and Radon transforms. For mini-project: SLIDE algorithm for edge detection. For mini-project: Improvements of the Hough transform. For mini-project: Compare the improved Hough transform and the SLIDE algorithm. Hint: References related to the SLIDE algorithm can be obtained from the lecturer. Realize an algorithm for recognition of circles and circular arcs. For mini-project: Algorithm for recognition of ellipses. How can the beginning and the end of a line be recognized using the Hough transform? For mini-project: Realization of algorithms for edge following. See the textbook.