Local Invariant Features

The document outlines a lecture on Local Invariant Features in Computer Vision and Image Processing, focusing on techniques such as Harris corner detection, blob detection with Laplacian of Gaussian (LoG), and the SIFT algorithm. It discusses the importance of local features for multi-view matching, including their desired properties like repeatability and distinctiveness. Additionally, it covers the mathematical foundations of corner detection and blob detection, emphasizing the role of geometric and photometric transformations.


COMP 5523 Lecture 512

Computer Vision and Image Processing

Local Invariant Features


Fall, 2024
Instructor: WU, Xiao-Ming
For internal use only,
please do not distribute!
Outline
• Local invariant features
• Harris corner detection
• Blob detection with LoG
• SIFT Algorithm
Local invariant feature detection
Previously: Features and filters

Transforming and describing images; colors, edges

Slide credit: Kristen Grauman


Now: Multiple views
Matching, invariant features, instance recognition

Lowe

Fei-Fei Li
Slide credit: Kristen Grauman
Panorama Stitching
• We have two images – how do we combine them?

Step 1: extract key points


Step 2: match key point features

Step 3: align images


Where are the corresponding points?

NASA Mars Rover images


Image matching

NASA Mars Rover images


with SIFT feature matches
Figure by Noah Snavely
https://www.youtube.com/watch?v=82jjFq303UY
https://www.youtube.com/watch?v=rDVW2_NgyAs
https://www.youtube.com/watch?v=KgsHoJYJ4S8&list=PL2zRqk16wsdqXEMpHrc4Qnb5rA1Cylrhx&index=12
Important tool for multiple views: Local features

Multi-view matching relies on local feature correspondences.

How to detect which local features to match?


Local features: main components
1) Detection: identify the interest points.
2) Description: extract a feature descriptor surrounding each interest point, e.g., x1 = [x1^(1), ..., xd^(1)] and x2 = [x1^(2), ..., xd^(2)].
3) Matching: determine correspondence between descriptors in the two views.
Slide credit: Kristen Grauman
Local features: desired properties
• Repeatability
• The same feature can be found in several images despite geometric and
photometric transformations
• Distinctiveness
• Each feature has a distinctive description
• Compactness and efficiency
• Many fewer features than image pixels
• Locality
• A feature occupies a relatively small area of the image; robust to clutter
and occlusion
Geometric transformations

e.g., scale,
translation,
rotation
Photometric transformations

e.g., illumination
change, shadows,
highlights.

Figure from T. Tuytelaars ECCV 2006 tutorial


Goal: interest operator repeatability

• We want to detect (at least some of) the same points in both images.
  (Otherwise there is no chance to find true matches!)
• Yet we have to be able to run the detection procedure independently per image, and still find some of the same points.
Goal: descriptor distinctiveness
• We want to be able to reliably determine which point goes with which.
• The descriptor must provide some invariance to geometric and photometric differences between the two views.
What is an interest point?
Some patches are not interesting
Are lines/edges interesting?
Pick a point in the image. Find it again in the next image.
What type of feature would you select? A corner!
What points would you choose?

Slide credit: Kristen Grauman


Detecting Corners
9300 Harris Corners Pkwy, Charlotte, NC

Slide credit: Kristen Grauman


Are Blobs Interesting?
Blob Detection
• For a Blob-like Feature to be
useful, we need to:
• Locate the blob
• Determine its size
• Determine its orientation
• Formulate a description or
signature that is independent of
size and orientation
https://www.youtube.com/watch?v=wcqbiHonfbo&list=PL2zRqk16wsdqXEMpHrc4Qnb5rA1Cylrhx&index=13
Outline
• Local invariant features
• Harris corner detection
• Blob detection with LoG
• SIFT Algorithm
Corners as distinctive interest points
• We should easily recognize the point by looking through a small window
• Shifting a window in any direction should give a large change in intensity

“flat” region: no change in any direction.
“edge”: no change along the edge direction.
“corner”: significant change in all directions.

Slide credit: Alyosha Efros, Darya Frolova, Denis Simakov
Corner Detection: Derivation
Change in appearance of window W for the shift [u, v]:

E(u, v) = Σ_{(x,y)∈W} [I(x + u, y + v) − I(x, y)]²

For example, E(3, 2) measures the change for a shift of [3, 2], and E(0, 0) = 0 by definition.
We want to find out how this function behaves for small shifts.
Corner Detection: Derivation
First-order Taylor approximation for small motions [u, v]:

I(x + u, y + v) ≈ I(x, y) + I_x u + I_y v

Let's plug this into E(u, v).
Notation: I_x = ∂I/∂x, I_y = ∂I/∂y, and I_x I_y is their elementwise product, so Σ_{x,y} I_x I_y = sum( (array of x gradients) .* (array of y gradients) ).
Corner Detection: Derivation
E(u, v) can be locally approximated by a quadratic surface:

E(u, v) ≈ [u v] M [u; v]

where the second moment matrix M is

M = Σ_{x,y} [ I_x²      I_x I_y ]
            [ I_x I_y   I_y²    ]
What does this matrix reveal?
First, consider an axis-aligned corner:

M = Σ [ I_x²      I_x I_y ]  =  [ λ₁  0 ]
      [ I_x I_y   I_y²    ]     [ 0   λ₂ ]

This means the dominant gradient directions align with the x or y axis.

Look for locations where both λ's are large.
If either λ is close to 0, then this is not corner-like.

What if we have a corner that is not aligned with the image axes?
What does this matrix reveal?
Since M is symmetric, it can be diagonalized:

M = X [ λ₁  0 ] Xᵀ,   with  M xᵢ = λᵢ xᵢ
      [ 0   λ₂ ]

The eigenvalues of M reveal the amount of intensity change in the two principal orthogonal gradient directions in the window.
Note: please refer to the eigendecomposition of symmetric matrices (closely related to the Singular Value Decomposition, SVD) for a detailed derivation.
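This eigenvalue interpretation can be checked numerically. A minimal sketch, assuming NumPy; the synthetic patch and the choice of window (the whole patch) are illustrative, not from the lecture:

```python
import numpy as np

# Synthetic 8x8 patch containing an axis-aligned corner:
# bright quadrant in the lower-right, dark elsewhere.
patch = np.zeros((8, 8))
patch[4:, 4:] = 1.0

# Finite-difference gradients over the patch.
Iy, Ix = np.gradient(patch)

# Second moment matrix M, summed over the whole patch (window = patch).
M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
              [np.sum(Ix * Iy), np.sum(Iy * Iy)]])

# Eigenvalues give the intensity change along the two principal directions.
lam = np.linalg.eigvalsh(M)
print(M)
print(lam)  # near a corner, both eigenvalues are well above zero
```

For a flat or edge-only patch the same computation yields at least one eigenvalue near zero, which is exactly the "not corner-like" case described above.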
Corner response function

“flat” region: λ₁ and λ₂ are both small.
“edge”: λ₁ ≫ λ₂ or λ₂ ≫ λ₁.
“corner”: λ₁ and λ₂ are both large, with λ₁ ~ λ₂.

Cornerness score (one common variant; others are possible):
f = λ₁λ₂ − α(λ₁ + λ₂)² = det(M) − α · trace(M)²
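The per-pixel response map can be sketched as follows, assuming NumPy/SciPy are available; the Gaussian window and k = 0.05 are common choices rather than values prescribed by the slides:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img, sigma=1.0, k=0.05):
    """Harris cornerness f = det(M) - k * trace(M)^2 at every pixel.

    The window sum is implemented as a Gaussian weighting of the
    gradient products (a common choice; k is typically 0.04-0.06).
    """
    Ix = sobel(img, axis=1)
    Iy = sobel(img, axis=0)
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# A white square on black: strong response expected near its corners.
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
f = harris_response(img)
print(np.unravel_index(np.argmax(f), f.shape))
```

Note the sign behavior: edges give negative f (det ≈ 0, trace large), flat regions give f ≈ 0, and only corners give a large positive f.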
Corner Detection: Derivation
E(u, v) can be locally approximated by a quadratic surface.
In which directions does this surface have the fastest/slowest change?
Interpreting the second moment matrix
A horizontal “slice” of E(u, v) is given by the equation of an ellipse:

[u v] M [u; v] = const

Consider the axis-aligned case (gradients are either horizontal or vertical):

M = Σ_{x,y} [ I_x²      I_x I_y ]  =  [ a  0 ]
            [ I_x I_y   I_y²    ]     [ 0  b ]

The ellipse then has axis lengths a^(−1/2) and b^(−1/2): the larger of a and b gives the shorter (minor) axis, and the smaller gives the longer (major) axis.
Which surface indicates a good image feature?
What kind of image patch do these surfaces represent?
[Figure: three quadratic surfaces of E(u, v), corresponding to a flat region, an edge (“line”), and a corner (“dot”); the corner's surface, rising in all directions, indicates a good feature.]
Harris corner detector
1) Compute the matrix M for each image window to get its cornerness score.
2) Find points whose surrounding window gives a large corner response (f > threshold).
3) Take the points of local maxima, i.e., perform non-maximum suppression.
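Steps 2 and 3 (thresholding plus non-maximum suppression) can be sketched as follows, assuming SciPy's `maximum_filter`; the toy response map is illustrative:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def harris_keypoints(f, threshold, nms_size=5):
    """Keep pixels whose response exceeds the threshold AND is the
    maximum of its (nms_size x nms_size) neighborhood."""
    local_max = (f == maximum_filter(f, size=nms_size))
    ys, xs = np.nonzero(local_max & (f > threshold))
    return list(zip(ys, xs))

# Toy response map with two well-separated peaks on a flat background.
f = np.zeros((20, 20))
f[5, 5] = 10.0
f[14, 12] = 8.0
pts = harris_keypoints(f, threshold=1.0)
print(pts)  # the two planted peaks
```

The threshold discards weak responses; the `maximum_filter` comparison keeps only one point per neighborhood, so nearby duplicate detections are suppressed.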
Harris Detector: Steps
1) Compute the corner response f.
2) Find points with large corner response: f > threshold.
3) Take only the points of local maxima of f.
https://www.youtube.com/watch?v=Z_HwkG90Yvw&list=PL2zRqk16wsdqXEMpHrc4Qnb5rA1Cylrhx&index=7
Properties of the Harris corner detector
• Rotation invariant? Yes. Since M = X diag(λ₁, λ₂) Xᵀ, rotating the image rotates the eigenvectors but leaves the eigenvalues, and hence the cornerness score, unchanged.
• Scale invariant? No. At a fine scale, every point along a rounded corner is classified as an edge; only at a coarser scale does it register as a corner!
Automatic Scale Selection

How to find corresponding patch sizes?

K. Grauman, B. Leibe
Automatic Scale Selection
Intuition:
• Find the scale that gives a local maximum of some function f in both position and scale.
[Figure: response f versus region size for Image 1 and Image 2; the peaks occur at corresponding region sizes s₁ and s₂.]
Automatic Scale Selection
• Function responses for increasing scale (scale signature): evaluate f(I_{i₁…iₘ}(x, σ)) over a range of σ in each image, and pick the σ at which the response peaks.

K. Grauman, B. Leibe
Outline
• Local invariant features
• Harris corner detection
• Blob detection with LoG
• SIFT Algorithm
Recall: First Derivative Filters
• Sharp changes in gray level of the input
image correspond to “peaks or valleys”
of the first-derivative of the input signal.

[Figure: a 1D example showing a signal F(x) and its first derivative F′(x).]
Slide from Robert Collins CSE486
Second-Derivative Filters
• Peaks or valleys of the first derivative of the input signal correspond to “zero-crossings” of the second derivative of the input signal.
[Figure: F(x), F′(x), and F″(x).]

Slide from Robert Collins CSE486


1D Gaussian and Derivatives

g(x) = e^(−x²/(2σ²))

g′(x) = −(x/σ²) e^(−x²/(2σ²))

g″(x) = (x²/σ⁴ − 1/σ²) e^(−x²/(2σ²))

Slide from Robert Collins CSE486
Laplacian of Gaussian
• Laplacian of Gaussian (LoG): circularly symmetric operator for blob detection in 2D

∇²g = ∂²g/∂x² + ∂²g/∂y²
Examples of Laplacian and LoG Filters
[Figure: the Laplacian filter and the Laplacian of Gaussian filter, ∇²g = ∂²g/∂x² + ∂²g/∂y².]
Blob detection in 2D
• At what scale does the Laplacian achieve a maximum response to a binary circle of radius r?
• To get the maximum response, the zeros of the Laplacian have to be aligned with the circle.
• The Laplacian is given by (up to scale): (x² + y² − 2σ²) e^(−(x² + y²)/(2σ²))
• Therefore, the maximum response occurs at σ = r/√2.
[Figure: a binary circle of radius r with the Laplacian profile overlaid on the image.]
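The σ = r/√2 result can be verified numerically, using SciPy's `gaussian_laplace` as the LoG and a scale-normalized (σ²-weighted) response; the grid size and σ range are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

# Binary circle of radius r; scan sigma and record the scale-normalized
# LoG response at the circle center. The text predicts a peak at r/sqrt(2).
r = 8
yy, xx = np.mgrid[-32:33, -32:33]
img = ((xx ** 2 + yy ** 2) <= r ** 2).astype(float)

sigmas = np.linspace(2.0, 10.0, 81)
responses = [abs(sigma ** 2 * gaussian_laplace(img, sigma)[32, 32])
             for sigma in sigmas]
best = sigmas[int(np.argmax(responses))]
print(best, r / np.sqrt(2))  # the two values should be close
```

The σ² factor is the scale normalization that makes responses comparable across scales; without it, the response decays with σ and no meaningful peak appears.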
Example: LoG Extrema
[Figure: LoG filter output at sigma = 2, with maxima and minima marked, plus a detail view of the maxima.]
Slide from Robert Collins CSE486
LoG Blob Finding
LoG filter extrema locate “blobs”:
maxima = dark blobs on a light background
minima = light blobs on a dark background

The scale of a blob (its size, i.e., radius in pixels) is determined by the sigma parameter of the LoG filter.
[Figure: LoG responses at sigma = 2 and sigma = 10.]

Slide from Robert Collins CSE486
Observe and Generalize
[Figure: convolving an image with the LoG; the maxima of the result mark blob locations.]
Slide from Robert Collins CSE486
Observe and Generalize
The LoG filter looks a bit like an eye, and indeed it responds maximally in the eye region!

Slide from Robert Collins CSE486

Blob detection in 2D: scale selection
• Laplacian of Gaussian = “blob” detector: ∇²g = ∂²g/∂x² + ∂²g/∂y²
[Figure: a bank of LoG filters at increasing scales applied to images img1, img2, img3.]
Blob detection in 2D
• We define the characteristic scale as the scale that produces the peak of the Laplacian response.

[Figure: the characteristic scale is the location of the peak in the Laplacian response profile.]

Slide credit: Lana Lazebnik


Example
[Figure: the original image and the same image scaled down to ¾ the size; as the image shrinks, the characteristic scale selected by the LoG shrinks accordingly.]

Slide credit: Kristen Grauman
http://www.cs.utexas.edu/~grauman/courses/spring2011/slides/lecture14_localfeats.pdf
Scale invariant interest points
Interest points are local maxima in both position and scale.

[Figure: a stack of squared filter response maps, L_xx(σ) + L_yy(σ), computed at scales s₁ … s₅; searching across position and scale yields a list of (x, y, σ) triples.]

Slide credit: Kristen Grauman
Scale-space blob detector: Example
Lindeberg: blobs are detected as local extrema in space and scale, within the LoG (or DoG) scale-space volume.
T. Lindeberg. Feature detection with automatic scale selection. IJCV 1998.
Image credit: Lana Lazebnik
Technical detail
• We can approximate the Laplacian with a difference of Gaussians, which is more efficient to implement.

Scale-normalized Laplacian: L = σ² (G_xx(x, y, σ) + G_yy(x, y, σ))
Difference of Gaussians: DoG = G(x, y, kσ) − G(x, y, σ)

Difference of Gaussians Filtering
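A quick numerical check of the approximation DoG ≈ (k − 1)σ²∇²G, assuming SciPy; k = 1.6 is a commonly cited choice, not a value given in the slides:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

# Compare DoG filtering against the scale-normalized Laplacian on a
# test image: the two responses should be strongly correlated.
rng = np.random.default_rng(0)
img = gaussian_filter(rng.standard_normal((64, 64)), 2.0)

sigma, k = 2.0, 1.6
dog = gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)
log = (k - 1) * sigma ** 2 * gaussian_laplace(img, sigma)

corr = np.corrcoef(dog.ravel(), log.ravel())[0, 1]
print(corr)  # close to 1
```

The efficiency gain in practice is that the Gaussian-blurred images are needed anyway to build the scale space, so each DoG level costs only one image subtraction.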
Outline
• Local invariant features
• Harris corner detection
• Blob detection with LoG
• SIFT Algorithm
SIFT
(Scale Invariant Feature Transform)
SIFT describes both a detector and a descriptor:
1. Multi-scale extrema detection
2. Refine location and scale
3. Orientation assignment
4. Keypoint descriptor
Steps of SIFT algorithm

• 1. Determine approximate location and scale of salient


feature points (also called keypoints)
• 2. Refine their location and scale
• 3. Determine orientation(s) for each keypoint.
• 4. Determine descriptors for each keypoint.
Step 1: Approximate keypoint location

L(x, y, σ) = G(x, y, σ) * I(x, y)
D(x, y, σ) = L(x, y, kσ) − L(x, y, σ)

(Down-sampling is applied between octaves.)

Octave = doubling of σ₀. Within an octave, the adjacent scales differ by a constant factor k. If an octave contains s + 1 images, then k = 2^(1/s). The first image has scale σ₀, the second image has scale kσ₀, the third image has scale k²σ₀, and the last image has scale k^s σ₀ = 2σ₀. Such a sequence of images convolved with Gaussians of increasing σ constitutes a so-called scale space.

[Figure: an image blurred at increasing scales: 0, 1, 4, 16, 64, 256.]

http://en.wikipedia.org/wiki/Scale_space
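The octave construction above can be sketched as follows, assuming SciPy; σ₀ = 1.6 and s = 3 are illustrative defaults, and each level is blurred directly from the input for simplicity (real implementations blur incrementally and keep extra levels for the extremum search):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_octave(img, sigma0=1.6, s=3):
    """One octave of the Gaussian scale space: s+1 images whose scales
    differ by the constant factor k = 2**(1/s), so the last level has
    scale 2*sigma0 (one doubling). Adjacent levels are subtracted to
    form the DoG images D(x, y, sigma)."""
    k = 2 ** (1.0 / s)
    levels = [gaussian_filter(img, sigma0 * k ** i) for i in range(s + 1)]
    dogs = [b - a for a, b in zip(levels, levels[1:])]
    return levels, dogs

img = np.random.default_rng(1).standard_normal((64, 64))
levels, dogs = build_octave(img)
print(len(levels), len(dogs))
```

After one octave, the image would be down-sampled by a factor of two and the process repeated, which is what keeps the whole pyramid cheap to compute.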
1. Scale-space extrema detection
• Detect maxima and minima of the difference-of-Gaussian in scale space.
• Each point is compared to its 8 neighbors in the current image and 9 neighbors each in the scales above and below. A point is selected if it is larger or smaller than all 26 neighbors.
• For each maximum or minimum found, the output is the location and the scale.
[Figure: the pyramid is built by repeatedly blurring, subtracting adjacent levels, and resampling.]
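The 26-neighbor extremum test can be sketched with SciPy's 3-D `maximum_filter`/`minimum_filter`; the contrast threshold and the planted toy values are illustrative (the threshold also previews the low-contrast rejection of Step 2):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def dog_extrema(dog_stack, contrast=0.5):
    """Find points larger (or smaller) than all 26 neighbors: 8 in the
    same DoG image plus 9 in each adjacent scale. dog_stack is a
    (scales, H, W) array; border scales are excluded because they lack
    a neighbor above or below."""
    mx = maximum_filter(dog_stack, size=3)   # 3x3x3 neighborhood max
    mn = minimum_filter(dog_stack, size=3)   # 3x3x3 neighborhood min
    is_ext = ((dog_stack == mx) | (dog_stack == mn)) \
        & (np.abs(dog_stack) > contrast)     # reject flat/low contrast
    is_ext[0], is_ext[-1] = False, False     # need a scale above and below
    return np.argwhere(is_ext)               # rows of (scale, y, x)

stack = np.zeros((4, 10, 10))
stack[1, 4, 4] = 5.0    # planted maximum
stack[2, 7, 2] = -3.0   # planted minimum
print(dog_extrema(stack))
```

The scale index of each returned triple is exactly the "scale associated with the keypoint" that the next slide motivates.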
Keypoint computations
• Why do we look for extrema of the DoG function?
  • Maxima of the DoG indicate dark points (blobs) on a bright background.
  • Minima of the DoG indicate bright points (blobs) on a dark background.
• Why do we look for extrema in a spatial as well as scale sense?
  • It helps us pick the “scale” associated with the keypoint!
Initial detection of keypoints

http://upload.wikimedia.org/wikipedia/commons/4/44/Sift_keypoints_filtering.jpg

Step 2: Refine Keypoint Location & Scale
• Removal of low-contrast keypoints
• Removal of high-contrast keypoints residing on edges
Step 3: Assigning orientations
• Compute the gradient magnitudes and orientations in a small window around the keypoint, at the appropriate scale.
[Figure: a histogram of gradient orientation over 0 to 2π; the bin counts are weighted by gradient magnitudes and a Gaussian weighting function.]
Step 3: Assigning orientations
• Assign the dominant orientation as the orientation of the keypoint.
• In case of multiple peaks, or histogram entries above 0.8 × the peak, create a separate descriptor for each such orientation (they will all have the same scale and location).
[Figure: the same gradient-orientation histogram, with the dominant peaks highlighted.]
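The orientation histogram can be sketched with NumPy alone; the 36-bin count follows Lowe's paper and the 0.8 peak ratio follows the text, while the Gaussian weighting and the peak interpolation used in practice are omitted:

```python
import numpy as np

def dominant_orientations(patch, nbins=36, peak_ratio=0.8):
    """Gradient-orientation histogram around a keypoint: bin counts
    weighted by gradient magnitude; keep every bin above 0.8 x peak.
    (A sketch: the Gaussian weight and sub-bin peak interpolation
    used by real implementations are left out.)"""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    hist, edges = np.histogram(ang, bins=nbins, range=(0, 2 * np.pi),
                               weights=mag)
    keep = np.nonzero(hist >= peak_ratio * hist.max())[0]
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[keep]

# A horizontal intensity ramp: the gradient points along +x everywhere,
# so a single dominant orientation near 0 radians is expected.
patch = np.tile(np.arange(16.0), (16, 1))
print(dominant_orientations(patch))
```

When several bins survive the 0.8 × peak cut, each returned orientation would spawn its own keypoint copy, matching the multiple-descriptor rule above.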
Step 4. Keypoint Descriptors
• At this point, each keypoint has
• location
• scale
• orientation
• Next is to compute a descriptor for the local image region
about each keypoint that is
• highly distinctive
• invariant as possible to variations such as changes in viewpoint and
illumination

Normalization
• Scale the window size based on the scale at which the point was found.
• Rotate the patch according to its dominant gradient orientation (the relative orientation never changes!). This puts the patches into a canonical orientation.

Image taken from slides by George Bebis (UNR).

Scale and Rotation Invariant
Step 4: Descriptors for each keypoint
• Use histograms to bin pixels within sub-patches according to their orientation (gradients binned by orientation over 0 to 2π).
• Final descriptor = concatenation of all the per-grid-cell histograms of the subdivided local patch, then normalize.
Step 4: Descriptors for each keypoint
• Use the normalized region about the keypoint.
• Compute gradient magnitude and orientation at each point in the region.
• Weight them by a Gaussian window overlaid on the circle.
• Create an orientation histogram over each of the 4 × 4 subregions of the window.
• 4 × 4 descriptors over a 16 × 16 sample array were used in practice; 4 × 4 histograms times 8 directions gives a vector of 128 values.

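A stripped-down version of the 128-value descriptor computation, in NumPy only; the Gaussian weighting, rotation to the canonical orientation, trilinear interpolation, and clipping used by the real algorithm are omitted for brevity:

```python
import numpy as np

def sift_like_descriptor(patch):
    """128-D descriptor sketch for a 16x16 patch: 4x4 cells, each an
    8-bin gradient-orientation histogram weighted by magnitude, all
    concatenated and L2-normalized."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    desc = []
    for cy in range(4):
        for cx in range(4):
            cell = np.s_[4 * cy:4 * cy + 4, 4 * cx:4 * cx + 4]
            hist, _ = np.histogram(ang[cell], bins=8,
                                   range=(0, 2 * np.pi),
                                   weights=mag[cell])
            desc.extend(hist)
    desc = np.asarray(desc)
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc

patch = np.random.default_rng(2).standard_normal((16, 16))
d = sift_like_descriptor(patch)
print(d.shape)  # (128,), i.e., 4 x 4 cells times 8 directions
```

The final L2 normalization is what gives the descriptor its partial invariance to affine illumination changes.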
Keypoint descriptor
[Figure: image gradients (4 × 4 pixels per cell) and the resulting SIFT descriptor (4 × 4 cells × 8 directions = 128 dimensions).]
SIFT descriptor [Lowe 2004]
• Extraordinarily robust matching technique
• Can handle changes in viewpoint: up to about 60 degrees of out-of-plane rotation
• Can handle significant changes in illumination, sometimes even day vs. night
• Fast and efficient: can run in real time
• Lots of code available, e.g., http://www.vlfeat.org/overview/sift.html

Steve Seitz
SIFT properties
• Invariant to
• Scale
• Rotation

• Partially invariant to
• Illumination changes
• Camera viewpoint
• Occlusion, clutter
Value of local (invariant) features
• Complexity reduction via selection of distinctive points
• Describe images, objects, parts without requiring
segmentation
• Local character means robustness to clutter, occlusion
• Robustness: similar descriptors in spite of noise, blur, etc.
Automatic mosaicing

Matthew Brown
http://matthewalunbrown.com/autostitch/autostitch.html
Recognition of specific objects, scenes

Scale Viewpoint

Lighting Occlusion

Slide credit: J. Sivic


https://www.youtube.com/watch?v=ram-jbLJjFg&list=PL2zRqk16wsdqXEMpHrc4Qnb5rA1Cylrhx&index=15
https://www.youtube.com/watch?v=IBcsS8_gPzE&list=PL2zRqk16wsdqXEMpHrc4Qnb5rA1Cylrhx&index=16
Summary
• Desirable properties for local features for
correspondence
• Basic matching pipeline
• Interest point detection
• Harris corner detector
• Laplacian of Gaussian and difference of Gaussians,
automatic scale selection
• SIFT descriptor
References
A lot of slides, images and contents of this
lecture are adapted from:
• CS 376: Computer Vision
http://vision.cs.utexas.edu/376-spring2018/#S
yllabus
• 16-385: Computer Vision
http://www.cs.cmu.edu/~16385/
