Computer Vision and Pattern Recognition
L11. Shape Representation (II)
Dr John Collomosse
Centre for Vision, Speech and Signal Processing
University of Surrey
Learning Outcomes
After attending this lecture, and doing the reading and
labwork you should be able to:
Describe how 2D moments can be used to represent shape
Understand how shape descriptors can be integrated into a standard
supervised classification framework for shape recognition
Describe basic morphological filtering operations like erosion, dilation,
opening and closure.
Construct and use chain codes
Discuss the invariance properties of different shape descriptors
Work with both region based shape descriptors and contour based shape
descriptors
Recall: Statistical Moments (1D)
Central moments are moments computed about the mean
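Written $\mu_1$ for short
In general, the $n^{\text{th}}$ central moment is:
$$\mu_n = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^n$$
$\mu_2$ is the variance of the data (measures "spread" = divergence from the mean)
$\mu_3$ is the skew of the data (measures how the data is "balanced" about the mean)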
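As a minimal sketch of the 1D case, the function below computes the n-th central moment of a list of samples as the mean of (x - mean)^n. The function name and the sample values are made up for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def central_moment(data, n):
    """n-th central moment: the mean of (x - mean) raised to the power n."""
    x = np.asarray(data, dtype=float)
    return np.mean((x - x.mean()) ** n)

# Hypothetical sample values, chosen only for illustration.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(central_moment(samples, 1))  # first central moment (zero up to rounding)
print(central_moment(samples, 2))  # variance
print(central_moment(samples, 3))  # third central moment (unnormalised skew)
```

A positive third central moment indicates that the data has a longer tail above the mean than below it.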
Statistical Moments (2D)
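The 2D equation for a moment about zero is:
$$m_{pq} = \frac{1}{N} \sum_{x} \sum_{y} x^p y^q I(x,y)$$
where $I(x,y)$ is 1 inside the shape and 0 outside, and $N$ is the number of pixels in the shape
$m_{10}$ – x component of mean
$m_{01}$ – y component of mean
The 2D equation for a central moment is:
$$\mu_{pq} = \frac{1}{N} \sum_{x} \sum_{y} (x - m_{10})^p (y - m_{01})^q I(x,y)$$
As before, $\mu_{00}$ is by definition = 1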
It is the 2D central moments that are mainly useful in shape description
Taking successive higher order moments decomposes a shape in a
conceptually similar way to Fourier analysis decomposing a signal
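The 2D moments can be sketched for a binary mask as follows. This is an illustration under stated assumptions rather than a reference implementation: the moments are normalised by the number of ON pixels (so that the order-zero moment is 1 and the first-order moments give the mean), x is the column index and y is the row index. The function names and the test mask are hypothetical.

```python
import numpy as np

def raw_moment(mask, p, q):
    """Moment about zero of order (p, q), normalised by the pixel count."""
    ys, xs = np.nonzero(mask)            # coordinates of the ON pixels
    return np.mean(xs.astype(float) ** p * ys.astype(float) ** q)

def central_moment_2d(mask, p, q):
    """Central moment of order (p, q): computed about the mean position."""
    ys, xs = np.nonzero(mask)
    dx = xs - xs.mean()                  # x offsets from the mean
    dy = ys - ys.mean()                  # y offsets from the mean
    return np.mean(dx ** p * dy ** q)

# Hypothetical test shape: a rectangle 4 pixels wide and 2 pixels high.
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 1:5] = True

print(raw_moment(mask, 1, 0), raw_moment(mask, 0, 1))   # mean x, mean y
print(central_moment_2d(mask, 2, 0))                    # spread along x
print(central_moment_2d(mask, 0, 2))                    # spread along y
```

For this wide rectangle the spread along x is larger than the spread along y, which is the kind of information the second-order central moments capture.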
Useful applications of Moments
The orientation of a shape can be computed using a ratio of moments
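One commonly used form of this ratio is tan(2θ) = 2μ11 / (μ20 − μ02), where θ is the angle of the principal axis. The sketch below assumes that form, measures the angle from the x axis towards the y axis (rows increase downwards in an image), and uses hypothetical function and variable names.

```python
import numpy as np

def orientation_degrees(mask):
    """Principal-axis angle of a binary mask from second-order central moments."""
    ys, xs = np.nonzero(mask)
    dx = xs - xs.mean()
    dy = ys - ys.mean()
    mu20 = np.mean(dx * dx)
    mu02 = np.mean(dy * dy)
    mu11 = np.mean(dx * dy)
    # arctan2 handles the case mu20 == mu02, where a plain ratio would divide by zero.
    return np.degrees(0.5 * np.arctan2(2.0 * mu11, mu20 - mu02))

# Hypothetical test shapes: a diagonal line and a horizontal bar.
diagonal = np.eye(5, dtype=bool)
bar = np.zeros((5, 5), dtype=bool)
bar[2, :] = True

print(orientation_degrees(diagonal))
print(orientation_degrees(bar))
```

The diagonal line lies at 45 degrees to the x axis and the horizontal bar at 0 degrees.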
The seven Hu moments (shape descriptors invariant to translation, scale and rotation)
Originally proposed for character recognition
No, you don’t have to
remember these for the exam...
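For reference only, the first two of the seven invariants can be sketched as below. The sketch assumes the usual definitions H1 = η20 + η02 and H2 = (η20 − η02)² + 4η11², where η_pq is the central moment computed as an unnormalised sum and divided by the area raised to the power 1 + (p+q)/2. The function name and the L-shaped test mask are hypothetical.

```python
import numpy as np

def hu_first_two(mask):
    """Return (H1, H2), the first two Hu invariants of a binary mask."""
    ys, xs = np.nonzero(mask)
    area = float(len(xs))                # order-zero moment as an unnormalised sum
    dx = xs - xs.mean()
    dy = ys - ys.mean()

    def eta(p, q):
        # Scale-normalised central moment of order (p, q).
        return np.sum(dx ** p * dy ** q) / area ** (1 + (p + q) / 2)

    h1 = eta(2, 0) + eta(0, 2)
    h2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return h1, h2

# Hypothetical L-shaped test region.
shape = np.zeros((8, 8), dtype=bool)
shape[1:6, 1:3] = True
shape[4:6, 3:7] = True

rotated = np.rot90(shape)                # rotate by 90 degrees
shifted = np.roll(shape, 1, axis=1)      # translate one pixel to the right

print(hu_first_two(shape))
print(hu_first_two(rotated))
print(hu_first_two(shifted))
```

The three printed pairs agree up to rounding, which illustrates the invariance to rotation and translation on this example.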
Hu Moments - Example
Computing Hu moments for this image under various transformations
Results in these values of the 7 Hu moments
Shape Descriptors
A shape descriptor is a feature representing the shape of a region or
contour.
Just like any descriptor (remember image descriptors?) a shape
descriptor should be discriminative and compact
So far we have met two shape descriptors:
• Shape factors
• Moments
These were suitable for describing regions = region based shape descriptors
Later on we will also look at some contour based shape descriptors
What use are Shape Descriptors?
Shape descriptors give us a vector of numbers describing the shape
As with all features, we consider these numbers as a point in an n-D space
So each shape maps to a point in the space
[Figure: a shape descriptor (H1, H2, ..., H7) plotted as a point in feature space.]
In this example, each shape maps to a point in 7D space
Shape Descriptors for Classification
Different shapes map to points in different parts of the space
[Figure: shape descriptors (H1 to H7) of different shapes plotted as points in different parts of the feature space.]
Shape Descriptors for Classification
If we model each class of shape we can build a supervised classifier
• Given a new shape, we calculate its descriptor (a point in the space)
• The model it’s nearest to tells us the category of the shape
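The two bullets above can be sketched with the simplest possible model, the mean descriptor of each class, and Euclidean distance as the measure of "nearest". Both choices are assumptions made for illustration (the Eigenmodels mentioned in this lecture are a richer model), and the 2D descriptor values and class names are made up.

```python
import numpy as np

def fit_class_means(descriptors, labels):
    """Model each class by the mean of its training descriptors."""
    X = np.asarray(descriptors, dtype=float)
    means = {}
    for c in sorted(set(labels)):
        rows = [i for i, label in enumerate(labels) if label == c]
        means[c] = X[rows].mean(axis=0)
    return means

def classify(means, descriptor):
    """Return the class whose mean is nearest to the new descriptor."""
    d = np.asarray(descriptor, dtype=float)
    return min(means, key=lambda c: np.linalg.norm(d - means[c]))

# Hypothetical 2D descriptors for two classes of shape.
train_x = [[0.20, 0.01], [0.22, 0.02], [0.60, 0.30], [0.62, 0.28]]
train_y = ["A", "A", "B", "B"]

models = fit_class_means(train_x, train_y)
print(classify(models, [0.58, 0.31]))
print(classify(models, [0.19, 0.03]))
```

In practice the descriptor would have one entry per feature (seven for Hu moments), and the features may need rescaling so that no single one dominates the distance.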
Shape Descriptors for Classification
All the supervised classification techniques covered before apply here
• Can use Eigenmodels (or even SVMs) to classify
• Use a confusion matrix to evaluate
Over to You
You win a military contract to design an air defence system
A camera will watch the sky and decide which kind of plane is overhead
MIG? or Stealth Bomber?
How will you do it?
Over to You - Solution
1) Separate the plane from the blue sky
- Use a colour classifier to decide if each pixel is sky or plane
2) Calculate a shape descriptor from the region
3) Feed the shape descriptor into a supervised classifier
(You first need to train the classifier with examples of each class of plane that needs to be recognised)
Shape Descriptors for Classification
Plane spotting with shape descriptors
Is it a...
Bird?
Plane?
Superman?
[Figure: bird, plane and man shapes plotted in the feature space (axes H1 to H7).]
Confusion matrix:
        bird   plane   man
BIRD    1.0    0       0
PLANE   0      1.0     0
MAN     0      0.5     0.5
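A confusion matrix like the bird/plane/man one can be computed from a list of true labels and a list of predicted labels. The sketch below assumes that rows are the true class, columns are the class assigned by the classifier, and each row is normalised to sum to 1. The label lists are made up so that the result matches the values shown in this example.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, classes):
    """Row-normalised confusion matrix: rows = true class, columns = predicted."""
    index = {c: i for i, c in enumerate(classes)}
    counts = np.zeros((len(classes), len(classes)))
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t], index[p]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(row_sums, 1)   # avoid dividing by zero for empty rows

# Hypothetical test results: one "man" shape is mistaken for a plane.
classes = ["bird", "plane", "man"]
truth = ["bird", "bird", "plane", "plane", "man", "man"]
predicted = ["bird", "bird", "plane", "plane", "plane", "man"]

cm = confusion_matrix(truth, predicted, classes)
print(cm)
```

A perfect classifier gives the identity matrix; off-diagonal entries show which classes are being confused with which.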
Cleaning up the binary mask
The shape descriptors we have looked at so far are based on regions
Regions are not always cleanly separable from the background
This creates noise in the binary mask
Binary Erosion
A simple form of non-linear filter can help - Erosion
[Figure: binary mask of a region, with a structuring element overlaid. The mask has speckles of noise and unwanted holes.]
0 0 0 0 1 1
0 1 0 0 1 1
0 0 0 1 1 1
1 0 1 1 0 1
0 0 1 1 1 1
0 0 1 1 1 1
0 1 1 1 1 1
Input                  Output
0 0 0 0 1 1            0 0 0 0 1 1
0 1 0 0 1 1            0 0 0 0 1 1
0 0 0 1 1 1            0 0 0 1 1 1
1 0 1 1 0 1   Erode    0 0 1 1 0 1
0 0 1 1 1 1            0 0 1 1 1 1
0 0 1 1 1 1            0 0 1 1 1 1
0 1 1 1 1 1            0 1 1 1 1 1
Count the number of ON pixels under the structuring element. If the count is > n, then the
output is ON, else it's OFF.
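The counting rule can be sketched as below. The sketch makes several assumptions that the slide does not state: the threshold n is set to one less than the number of ON cells in the structuring element (so the output is ON only when every cell is covered, which is the textbook definition of erosion), the element has odd width and height, and pixels outside the image count as OFF. The function names and the test mask are hypothetical.

```python
import numpy as np

def count_on_under_element(mask, element):
    """For every pixel, count the ON mask pixels under the ON cells of the element."""
    mask = np.asarray(mask, dtype=int)
    element = np.asarray(element, dtype=int)
    eh, ew = element.shape
    h, w = mask.shape
    padded = np.pad(mask, ((eh // 2, eh // 2), (ew // 2, ew // 2)))  # zero padding
    counts = np.zeros_like(mask)
    for dy in range(eh):
        for dx in range(ew):
            if element[dy, dx]:
                counts += padded[dy:dy + h, dx:dx + w]
    return counts

def erode(mask, element):
    """Output is ON where the count of ON pixels exceeds n, else OFF."""
    n = int(np.sum(element)) - 1          # assumption: every element cell must be ON
    return (count_on_under_element(mask, element) > n).astype(int)

# Hypothetical test mask: a 3x3 block plus one speckle of noise.
mask = np.zeros((5, 5), dtype=int)
mask[1:4, 1:4] = 1
mask[0, 4] = 1
element = np.ones((3, 3), dtype=int)

print(erode(mask, element))
```

The speckle disappears and the block shrinks to its centre pixel, the only position where the whole element fits inside the region.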
Binary Dilation
Dilation is the “conceptual” opposite of erosion
Input                  Output
0 0 0 0 1 1            0 0 0 0 1 1
0 0 0 0 1 1            0 0 0 0 1 1
0 0 0 1 1 1            0 0 0 1 1 1
0 0 1 1 0 1   Dilate   0 0 1 1 1 1
0 0 1 1 1 1            0 0 1 1 1 1
0 0 1 1 1 1            0 0 1 1 1 1
0 1 1 1 1 1            0 1 1 1 1 1
Count the number of OFF pixels under the structuring element. If the count is > n, then the
output is OFF, else it's ON.
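The same counting idea gives a sketch of dilation. As assumptions not stated on the slide, n is set to one less than the number of ON cells in the structuring element (so the output is OFF only when every cell under the element is OFF), the element is symmetric with odd width and height, and pixels outside the image count as OFF. The names and the test mask are hypothetical.

```python
import numpy as np

def dilate(mask, element):
    """Output is OFF where the count of OFF pixels exceeds n, else ON."""
    mask = np.asarray(mask, dtype=int)
    element = np.asarray(element, dtype=int)
    eh, ew = element.shape
    h, w = mask.shape
    padded = np.pad(mask, ((eh // 2, eh // 2), (ew // 2, ew // 2)))  # zero padding
    off_counts = np.zeros_like(mask)
    for dy in range(eh):
        for dx in range(ew):
            if element[dy, dx]:
                off_counts += 1 - padded[dy:dy + h, dx:dx + w]
    n = int(np.sum(element)) - 1          # assumption: OFF only if every cell is OFF
    return (off_counts <= n).astype(int)

# Hypothetical test mask: a 3x3 block with a one-pixel hole in the middle.
mask = np.zeros((7, 7), dtype=int)
mask[2:5, 2:5] = 1
mask[3, 3] = 0
element = np.ones((3, 3), dtype=int)

print(dilate(mask, element))
```

The hole is filled and the block grows by one pixel on every side.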
Morphological Operations
Dilation and erosion are called “morphological operators”
Because they are non-linear, they can't be achieved via convolution alone
Combinations of dilation and erosion can be very useful
A dilation followed by an erosion is a “closure”
An erosion followed by a dilation is an “opening”
Binary Opening and Closure
These operations illustrate why dilation is not the inverse of erosion
The effect of one does not reverse the other, and this is useful!
Opening: Erode, then Dilate
Closure: Dilate, then Erode
Obtaining boundary of a region
A different combination of erosion and dilation can be used here too
Boundary(I) = Dilate(I) – Erode(I)
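The boundary formula can be sketched for a binary mask as below. Because the eroded region always lies inside the dilated one, the subtraction is implemented as "dilated AND NOT eroded". The erosion and dilation helpers are a minimal illustration assuming a symmetric 3x3 structuring element and zero padding outside the image; the names and the test mask are hypothetical.

```python
import numpy as np

def _neighbourhoods(mask, element):
    """List of shifted views of the zero-padded mask, one per ON element cell."""
    eh, ew = element.shape
    h, w = mask.shape
    padded = np.pad(mask, ((eh // 2, eh // 2), (ew // 2, ew // 2)))
    return [padded[dy:dy + h, dx:dx + w] for dy, dx in zip(*np.nonzero(element))]

def boundary(mask, element):
    """Boundary(I) = Dilate(I) - Erode(I) for a boolean mask."""
    views = _neighbourhoods(mask, element)
    dilated = np.logical_or.reduce(views)
    eroded = np.logical_and.reduce(views)
    return dilated & ~eroded

# Hypothetical test mask: a 3x3 block in a 7x7 image.
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True
element = np.ones((3, 3), dtype=bool)

edge = boundary(mask, element)
print(edge.astype(int))
```

With this element the result is a band two pixels thick that straddles the edge of the region, since the dilation adds one pixel outside and the erosion removes one pixel inside.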
Contour based shape descriptors
The shape descriptors we have looked at
so far are based on regions
We compute them from connected
components in the image
We will now look at a way of
computing a shape descriptor from
the contour (boundary) of a region
Chain code
The chain (Freeman) code was introduced by Freeman in 1961 as one of
the first contour based shape representations.
Chain codes can represent open or closed shapes
They can also be used for compression (e.g. 3x2 +1x3...)
Common uses for chain codes are in gesture recognition due to
powerful string matching algorithms that can be applied.
Creating a chain code
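A chain code can be built by walking along the contour and recording the direction of each step. The sketch below assumes a 4-direction code numbered anti-clockwise (0 = right, 1 = up, 2 = left, 3 = down), a y axis that increases upwards, and a contour supplied as an ordered list of (x, y) points one unit step apart. The numbering convention, the function name and the example points are assumptions for illustration.

```python
# Assumed direction numbering: 0 = right, 1 = up, 2 = left, 3 = down.
STEP_TO_CODE = {(1, 0): 0, (0, 1): 1, (-1, 0): 2, (0, -1): 3}

def chain_code(points, closed=True):
    """Convert an ordered list of (x, y) contour points into a 4-direction chain code."""
    pts = list(points)
    if closed:
        pts = pts + pts[:1]              # return to the start point
    # A KeyError here means two consecutive points are not one unit step apart.
    return [STEP_TO_CODE[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(pts, pts[1:])]

# Hypothetical contour: a unit square traversed anti-clockwise from (0, 0).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = [(x + 5, y + 3) for x, y in square]    # the same square, translated

print(chain_code(square))
print(chain_code(moved))
```

Both contours give the same code because only the steps between points are recorded, not the absolute positions.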
Over to you
Can chain codes represent all shapes?
As described, are they invariant to:-
• Translation?
• Rotation?
• Scale?
Over to you
Can chain codes represent all shapes?
Yes
As described, are they invariant to:-
• Translation? Yes
• Rotation? No
• Scale? No
Start-point Invariant Chain codes
We can circularly shift (character-rotate) the chain code string to a known point, to
make it invariant to the starting point of the coding.
Do this by computing the "first difference" of a chain code
Difference of adjacent codes, counted anti-clockwise,
e.g. CC=10103322 FD=3133030
Try every rotation of the chain code to find the one giving the smallest integer FD
Note this is not the same as rotational invariance.
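The first difference and the search over rotations can be sketched as below. The difference is taken as (next − current) mod 4, which reproduces the CC/FD example above. For the start-point normalisation the sketch assumes a closed contour and also includes the difference from the last code back to the first, so that every start point yields a circular shift of the same string; that wrap-around step and the function names are assumptions.

```python
def first_difference(code, directions=4, closed=False):
    """Number of anti-clockwise direction changes between adjacent codes."""
    pairs = list(zip(code, code[1:]))
    if closed:
        pairs.append((code[-1], code[0]))    # wrap from the last code to the first
    return [(b - a) % directions for a, b in pairs]

def smallest_rotation(seq):
    """Circular shift of seq whose digits form the smallest integer."""
    rotations = [seq[i:] + seq[:i] for i in range(len(seq))]
    return min(rotations)                    # equal lengths: lexicographic = numeric

def start_point_invariant(code):
    return smallest_rotation(first_difference(code, closed=True))

cc = [1, 0, 1, 0, 3, 3, 2, 2]
print(first_difference(cc))                  # matches FD=3133030 in the example

# The same closed contour, coded from a different start point.
restarted = cc[3:] + cc[:3]
print(start_point_invariant(cc))
print(start_point_invariant(restarted))
```

The last two lines print the same string, which shows that the normalised form does not depend on where the coding started.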
Summary
After attending this lecture, and doing the reading and
labwork you should be able to:
Describe how 2D moments can be used to represent shape
Understand how shape descriptors can be integrated into a standard
supervised classification framework for shape recognition
Describe basic morphological filtering operations like erosion, dilation,
opening and closure.
Construct and use chain codes
Discuss the invariance properties of different shape descriptors
Work with both region based shape descriptors and contour based shape
descriptors
Further Reading
“Feature extraction and image processing”
- pp. 103 – (Morphological operations, e.g. erosion, dilation)
- pp. 311 – (Moments)