
Local Features - Harris Corner detection

Clara Gonçalves
March 2024

1 Feature Extraction
In the process of feature extraction, the aim is to identify specific key points, such as corners and blobs, within an image. Blobs refer to regions in an image with similar intensity or color, often indicating objects or structures of interest.

Motivation
Feature extraction is motivated by a variety of applications; one example is automatic panorama creation and stitching, where multiple images are combined into a single cohesive panorama.
This process typically involves the following steps:

1. Feature Extraction: Identifying distinctive features in each image.

2. Feature Matching: Finding corresponding features in different images.


3. Image Alignment: Adjusting the images to align the matching features.
4. Blending Images: Seamlessly blending aligned images to create a cohesive panorama.

Achieving successful panorama stitching requires accurate estimation of geometric transformations to align the images effectively.

Global Features vs Local Features

Global Features: These features characterize the entire image and provide a holistic representation. Examples include shape matrices, invariant moments, Histograms of Oriented Gradients (HOG), and Co-HOG.

Local Features: In contrast, local features highlight specific points of interest within an image. They are
crucial for tasks such as object recognition.
While global features are often utilized for tasks like object detection (determining the presence or absence
of an object), local features excel in object recognition by identifying and matching specific keypoints across
images.

1.1 Applications
Feature extraction finds widespread utility in various domains, including:
- Visual SLAM (Simultaneous Localization and Mapping) and Tracking in AR/VR environments, where it aids in constructing or updating a map of an unknown environment while simultaneously tracking the agent's position within it.
- 3D Reconstruction, facilitating the creation of three-dimensional models from images by identifying and correlating relevant features.
- Augmented Reality, enhancing real-world environments with digital information by accurately detecting and matching features between the physical and virtual worlds.
- Image Matching, particularly in scenarios where images may depict the same scene from different angles, positions, or zoom levels. Feature extraction enables the identification of meaningful features within images, facilitating tasks like image registration and scene recognition.
These are just a few examples, as feature extraction finds application in numerous other fields and tasks.

Advantages of Local Features

Local features offer several advantages over global features:

• Locality: Local features focus on specific regions within an image, making them robust to occlusion
and clutter.

• Quantity: Hundreds or even thousands of local features can be detected in a single image, providing
rich information for analysis.
• Distinctiveness: Local features possess unique characteristics that allow them to be distinguished within a large database of objects.

• Efficiency: Real-time performance can be achieved with local feature extraction algorithms.

Invariant Local Features


Invariant local features aim to find keypoints that remain consistent under various transformations, including:

• Geometric Invariance: Such features are invariant to translation, rotation, and scale changes.
• Photometric Invariance: They also exhibit invariance to changes in brightness, exposure, and other
photometric transformations.

The objective is to identify regions within an image that are distinctive or unique, leading to unambiguous
matches across multiple images. Defining what constitutes an "unusual" feature is a critical aspect of local
feature extraction.

2 Harris Corner Detector


The Harris Corner Detector offers a fundamental method for identifying corners within an image. Its
underlying principle involves analyzing local image neighborhoods instead of individual pixels to pinpoint
corners effectively.
The key characteristics of corners that the Harris Corner Detector exploits include:

• Easily Recognizable Points: Corners should be easily distinguishable when observing a small window around them.

• Intensity Variation: Shifting a window in any direction around a corner should result in a significant change in intensity.

Based on these observations, the detector defines a local measure of uniqueness for different regions of
an image:

• Flat Region: Uniform areas exhibit no change in intensity in all directions.


• Edge Region: Intensity remains constant along the edge directions.
• Corner Region: Significant intensity changes occur in all directions.

Mathematics behind the Harris Corner Detection
The Harris Corner Detection algorithm evaluates the change in intensity within a small window when shifted
by a small amount. This change is quantified using the Sum of Squared Differences (SSD) error function,
defined as:
E(u, v) = \sum_{x,y} w(x, y) \, [I(x + u, y + v) - I(x, y)]^2

However, computing E(u, v) for every pixel and every offset (u, v) is computationally inefficient.
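To make the error function concrete, here is a brute-force sketch (not part of the original notes; the function name, window size, and test image are illustrative assumptions) that evaluates E(u, v) for a single pixel and a single offset. Repeating this for every pixel and every offset is exactly the computation the detector tries to avoid.

```python
import numpy as np

def ssd_error(image, x, y, u, v, half_win=2):
    """Brute-force E(u, v) for a single pixel (x, y): sum of squared
    differences between the window around (x, y) and the same window shifted
    by (u, v), using a uniform window function w(x, y) = 1."""
    E = 0.0
    for dy in range(-half_win, half_win + 1):
        for dx in range(-half_win, half_win + 1):
            diff = float(image[y + dy + v, x + dx + u]) - float(image[y + dy, x + dx])
            E += diff * diff
    return E

# Toy example: a bright square on a dark background.  Around a corner of the
# square, shifting the window changes E noticeably; in a flat region it does not.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
print(ssd_error(img, x=5, y=5, u=1, v=1))    # near a corner: E > 0
print(ssd_error(img, x=10, y=10, u=1, v=1))  # flat interior: E == 0.0
```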

Small Motion Assumption


To simplify computation, the Harris Corner Detector relies on the small motion assumption. By Taylor
series expansion of the image I, we get:

I(x + u, y + v) \approx I(x, y) + u I_x(x, y) + v I_y(x, y)


For small motions (u, v), this equation simplifies to:

I(x + u, y + v) \approx I(x, y) + u I_x + v I_y


Substituting this approximation into the SSD error function, we obtain:
E(u, v) = \sum_{x,y} w(x, y) \, [u I_x + v I_y]^2

Second Moment Matrix


Given the derivatives I_x and I_y, for small shifts [u, v] the SSD error function E(u, v) has a bilinear approximation as a quadratic error function:

E(u, v) \approx \begin{pmatrix} u & v \end{pmatrix} M \begin{pmatrix} u \\ v \end{pmatrix}

where M is the second moment matrix (a 2x2 matrix computed from image derivatives):

M = \sum_{x,y} w(x, y) \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}

In the presence of edges:

• In a horizontal edge, only the Iy derivative is non-zero.


• In a vertical edge, only the Ix derivative is non-zero.

Equivalently, M can be viewed as the covariance matrix of the partial derivatives of the image intensity with respect to the x and y axes:

M = \begin{pmatrix} \sum_{p \in P} I_x I_x & \sum_{p \in P} I_x I_y \\ \sum_{p \in P} I_y I_x & \sum_{p \in P} I_y I_y \end{pmatrix}

By computing the gradient covariance matrix, we are fitting a quadratic to the gradients over a small image region.
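A minimal NumPy/SciPy sketch of how the per-pixel entries of M might be computed; the choice of Sobel derivatives and a Gaussian window function w(x, y), as well as the function name and parameters, are assumptions for illustration.

```python
import numpy as np
from scipy import ndimage

def second_moment_entries(image, window_sigma=1.5):
    """Return per-pixel entries of the second moment matrix
    M = sum_w w(x, y) [[Ix^2, Ix*Iy], [Ix*Iy, Iy^2]],
    where the window function w is a Gaussian of the given sigma."""
    image = image.astype(float)
    # Image derivatives (Sobel approximations of Ix and Iy).
    Ix = ndimage.sobel(image, axis=1)  # derivative along x (columns)
    Iy = ndimage.sobel(image, axis=0)  # derivative along y (rows)
    # Gaussian-weighted sums of the derivative products over a local window.
    Sxx = ndimage.gaussian_filter(Ix * Ix, window_sigma)
    Sxy = ndimage.gaussian_filter(Ix * Iy, window_sigma)
    Syy = ndimage.gaussian_filter(Iy * Iy, window_sigma)
    return Sxx, Sxy, Syy
```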

Eigenvalues and Eigenvectors
Eigenvectors of a matrix are vectors e that satisfy:

M e = λe

(M − λI)e = 0

where λ is the eigenvalue corresponding to the eigenvector e.
These values are computed through the following steps (a numerical example follows the list):
1. Compute the determinant of M − λI, which will yield a polynomial.

2. Find the roots of the polynomial det (M − λI) = 0; these calculations return the eigenvalues.
3. For each eigenvalue, solve the equation (M − λI)e = 0 to find the eigenvectors.
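For a concrete 2x2 case, these steps can be carried out numerically with NumPy rather than by expanding the characteristic polynomial by hand; the matrix values below are made up purely for illustration.

```python
import numpy as np

# A hypothetical second moment matrix for one window (symmetric by construction).
M = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# det(M - lambda*I) = 0  -> eigenvalues; (M - lambda*I) e = 0 -> eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(M)  # eigh is suited to symmetric matrices
print(eigenvalues)         # lambda_1, lambda_2 in ascending order
print(eigenvectors[:, 0])  # eigenvector for the smallest eigenvalue
```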
 
We can visualize these eigenvectors and eigenvalues as an ellipse. Since M is symmetric, it can be diagonalized as

M = R^{-1} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} R

which gives us an ellipse equation in quadratic form:

\begin{pmatrix} u & v \end{pmatrix} M \begin{pmatrix} u \\ v \end{pmatrix} = \text{const}

The lengths of this ellipse's axes are determined by the eigenvalues, and its orientation by R (composed of the eigenvectors e1 and e2).
Now, let's interpret the second moment matrix with examples:

2.1 Corner Detection


 
Using the formula \begin{pmatrix} u & v \end{pmatrix} M \begin{pmatrix} u \\ v \end{pmatrix} = \text{const}, we substitute M with the following matrix:

H = \begin{pmatrix} A & B \\ B & C \end{pmatrix}

and derive:

H x_{max} = \lambda_{max} x_{max}

H x_{min} = \lambda_{min} x_{min}

This means that the eigenvalues and eigenvectors of H:

• Define shift directions with the smallest and largest change in error.
• xmax is the direction of the largest increase of E.
• λmax is the amount of increase in direction xmax .
• xmin is the direction of the smallest increase of E.

• λmin is the amount of increase in direction xmin .

The relationship between the eigenvalues λ1 and λ2 can be summarized as follows:

• If λ1 and λ2 are both small and similar, the region is flat.
• If λ1 is much greater than λ2, or vice versa, the region is a vertical or horizontal edge.
• If λ1 and λ2 are both large and similar in value, the region is a corner.

A threshold needs to be set for each region.
In summary, corner detection proceeds as follows (a code sketch follows the list):
1. Compute the gradient at each point in the image.
2. For each pixel:
• Create H matrix from nearby gradient values.
• Compute eigenvalues.
• Find points with large response (λmin > threshold).
3. Choose those points where λmin is a local maximum as features.
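A dense sketch of this λmin-based procedure, assuming Sobel derivatives, a Gaussian window, and an arbitrary threshold; the closed-form expression for the smallest eigenvalue of a symmetric 2x2 matrix avoids an explicit eigendecomposition at every pixel.

```python
import numpy as np
from scipy import ndimage

def lambda_min_map(image, window_sigma=1.5):
    """Smallest eigenvalue of the 2x2 matrix H at every pixel.
    For a symmetric 2x2 matrix [[a, b], [b, c]] the eigenvalues are
    (a + c)/2 +/- sqrt(((a - c)/2)^2 + b^2)."""
    image = image.astype(float)
    Ix = ndimage.sobel(image, axis=1)
    Iy = ndimage.sobel(image, axis=0)
    a = ndimage.gaussian_filter(Ix * Ix, window_sigma)
    b = ndimage.gaussian_filter(Ix * Iy, window_sigma)
    c = ndimage.gaussian_filter(Iy * Iy, window_sigma)
    return (a + c) / 2 - np.sqrt(((a - c) / 2) ** 2 + b ** 2)

# Keep pixels whose lambda_min exceeds a threshold and is a local maximum.
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
lmin = lambda_min_map(img)
is_max = lmin == ndimage.maximum_filter(lmin, size=5)
corners = np.argwhere((lmin > 0.1 * lmin.max()) & is_max)
print(corners)  # should list points near the four corners of the bright square
```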

2.2 The Harris Operator


Thresholding λmin is one variant of the Harris operator for feature detection: it helps us decide whether a point fits a corner. A closely related measure, which avoids computing the eigenvalues explicitly, is

f = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2} = \frac{\det(H)}{\mathrm{trace}(H)}    (1)

where the trace is the sum of the diagonal elements, i.e., trace(H) = h_{11} + h_{22}.
This measure is called the Harris corner detector or Harris operator. Many other detectors exist, but this is the most popular.

Harris detector steps (sketched in code below):

1. Compute Gaussian derivatives at each pixel (e.g., with a Sobel filter).
2. Compute the second moment matrix H in a Gaussian window around each pixel.
3. Compute the corner response function f, giving a response map R.
4. Threshold R.
5. Find local maxima of the response function (non-maximum suppression within a certain window); non-maximum suppression keeps only the points that are local maxima of R.

Other examples of corner response functions include those of Harris and Stephens, Kanade and Tomasi, and Noble.
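A minimal end-to-end sketch of the five steps above using the det/trace response of Equation (1); the specific sigmas, threshold, and non-maximum-suppression window are illustrative assumptions, not values prescribed by the notes.

```python
import numpy as np
from scipy import ndimage

def harris_corners(image, deriv_sigma=1.0, window_sigma=1.5,
                   rel_threshold=0.01, nms_size=7):
    """Harris operator f = det(H) / trace(H), followed by thresholding and
    non-maximum suppression. Returns an array of (row, col) corner locations."""
    image = ndimage.gaussian_filter(image.astype(float), deriv_sigma)
    # 1. Gaussian-smoothed Sobel derivatives at each pixel.
    Ix = ndimage.sobel(image, axis=1)
    Iy = ndimage.sobel(image, axis=0)
    # 2. Second moment matrix H in a Gaussian window around each pixel.
    Sxx = ndimage.gaussian_filter(Ix * Ix, window_sigma)
    Sxy = ndimage.gaussian_filter(Ix * Iy, window_sigma)
    Syy = ndimage.gaussian_filter(Iy * Iy, window_sigma)
    # 3. Corner response f = det(H) / trace(H) (small eps avoids division by zero).
    det = Sxx * Syy - Sxy * Sxy
    trace = Sxx + Syy
    f = det / (trace + 1e-12)
    # 4. Threshold the response.
    strong = f > rel_threshold * f.max()
    # 5. Non-maximum suppression: keep only local maxima of f within a window.
    local_max = f == ndimage.maximum_filter(f, size=nms_size)
    return np.argwhere(strong & local_max)
```

Replacing the response line with det(H) − k · trace(H)^2 (with k around 0.04 to 0.06) gives the Harris and Stephens variant, while thresholding λmin directly corresponds to the Kanade and Tomasi variant mentioned above.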

Properties of Harris Corner Detector


The Harris corner detector exhibits robustness to various transformations, ensuring that features deemed
significant in one image remain significant even after transformations are applied. This robustness can be
categorized into two main types:

Invariance and Equivariance

Invariance refers to the property where corner locations remain unchanged under certain transformations, specifically photometric ones. Equivariance, on the other hand, implies that if an image undergoes geometric transformations, the detected features should correspond to those in the original image.

• Geometric Transformations: Under geometric transformations such as translation and rotation, corner locations remain consistent in the sense that they move together with the image content (equivariance). As noted below, this robustness does not extend to scaling.
• Photometric Invariance: The detector is also invariant to photometric changes, such as alterations
in intensity. This means that even if the intensity of the image varies, the detected corners remain
unchanged.
• Equivariance with Geometric Transformations: When the image undergoes transformations like
rotation or scaling, the corner locations adjust accordingly. If two versions of the same image are
transformed, the features are detected in corresponding locations.

Impact of Transformations

• Image Translation: Derivatives and window functions employed in the detector are equivariant,
meaning that different positions in the image with the same derivatives are identified effectively. Hence,
corner locations exhibit equivariance with translation.
• Image Rotation: Despite image rotation, the detector aims to maintain robustness. Although the
second-moment ellipse rotates, its essential characteristics, such as eigenvalues, remain unchanged.
Thus, corner locations are equivariant with image rotation.
• Affine Intensity Change: The detector primarily relies on derivatives, making it partially invariant
to affine intensity changes like shifts or scaling in intensity values. However, due to a fixed intensity
threshold on local maxima, it only exhibits partial invariance to multiplicative intensity changes.
• Scaling: Unfortunately, the detector is not robust to scaling transformations. It neither maintains
invariance nor equivariance with scaling, which can pose challenges in scenarios involving significant
scale variations.
While the Harris corner detector offers robustness to various transformations, it’s essential to note its
limitations, particularly regarding scaling, where its performance might be compromised.

3 Scale Invariant Detection


In computer vision, detecting objects or features regardless of their size is crucial for various applications. One
common technique is blob detection, where regions of interest are identified irrespective of their scale. This
section discusses the scale-invariant detection approach, emphasizing the use of scale-normalized Laplacian
of Gaussian (LoG) filters.

3.1 Overview
In scale-invariant detection, the goal is to identify features, such as corners or blobs, that remain consistent
across images despite changes in scale. This is particularly useful for tasks like object recognition and
tracking, where objects may appear at different scales due to perspective changes or varying distances from
the camera.

3.2 Key Concepts


• Scale Normalized Response Function: At each pixel location (x, y), a scale-invariant response
function is computed over a range of scales (σ). The scale-space location (x, y, σ) where this function
reaches a local maximum corresponds to a key feature.

• Laplacian of Gaussian (LoG) Filter: The LoG filter is used to detect regions with intensity
variations, such as edges or corners. It highlights areas of rapid intensity change, making it suitable
for detecting key features.

3.3 Scale Selection


To find the characteristic scale of a feature, the image is convolved with scale-normalized LoG filters at various
scales. The scale that produces the maximum response indicates the optimal scale for feature detection. This
process ensures that features are detected consistently across different scales.

Figure 1: Response of Scale-Normalized LoG Filter at Different Scales

In cases where the scale doesn’t match perfectly, additional techniques such as zero-crossing detection
can be employed to refine feature detection.
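As a sketch of characteristic-scale selection, the snippet below convolves a synthetic blob with scale-normalized LoG filters (σ² multiplying SciPy's gaussian_laplace) over a range of scales and picks the scale with the strongest response at one location; the test image, scale range, and function name are assumptions.

```python
import numpy as np
from scipy import ndimage

def characteristic_scale(image, row, col, sigmas):
    """Convolve with scale-normalized LoG filters (sigma^2 * LoG) over a range
    of sigmas and return the sigma giving the strongest response at (row, col)."""
    responses = []
    for sigma in sigmas:
        log = sigma ** 2 * ndimage.gaussian_laplace(image.astype(float), sigma)
        responses.append(abs(log[row, col]))
    return sigmas[int(np.argmax(responses))]

# Example: a bright disk of radius 8; the maximum scale-normalized LoG response
# at the center is expected near sigma ~ radius / sqrt(2) ~ 5.7.
yy, xx = np.mgrid[0:64, 0:64]
img = ((yy - 32) ** 2 + (xx - 32) ** 2 <= 8 ** 2).astype(float)
sigmas = np.linspace(1.0, 12.0, 23)
print(characteristic_scale(img, 32, 32, sigmas))
```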

3.4 Implementation
To implement scale-invariant detection efficiently, a Gaussian pyramid is often used. This involves creating
multiple scaled versions of the image and applying feature detection algorithms independently at each level.
The following steps outline the implementation (a code sketch follows the list):

1. Feature Detection: Compute feature responses (e.g., Harris-Laplacian) at each level of the Gaussian
pyramid.
2. Local Maxima Detection: Identify local maxima in the 3D position-scale space to determine key
feature locations.

3. Scale and Feature Matching: Save the scale and location of each feature (e.g., (x, y, σ)), together with its local descriptor, for further processing, such as feature matching between images.
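A rough sketch of this pyramid-based scheme (function names, the number of levels, and the downsampling factor are assumptions): it builds a simple Gaussian pyramid, runs any single-scale detector (for example the harris_corners sketch above) at each level, and records (x, y, σ) in original-image coordinates. Treating the level's downsampling factor as the feature scale is a simplification.

```python
import numpy as np
from scipy import ndimage

def gaussian_pyramid(image, num_levels=4, downscale=2, sigma=1.0):
    """Build a list of progressively blurred and downsampled images."""
    levels = [image.astype(float)]
    for _ in range(num_levels - 1):
        blurred = ndimage.gaussian_filter(levels[-1], sigma)
        levels.append(blurred[::downscale, ::downscale])  # simple decimation
    return levels

def detect_over_scales(image, detect_fn, num_levels=4, downscale=2):
    """Run a single-scale detector at every pyramid level and map keypoints
    back to (x, y, sigma) in original-image coordinates."""
    keypoints = []
    for level, img in enumerate(gaussian_pyramid(image, num_levels, downscale)):
        scale = downscale ** level
        for r, c in detect_fn(img):
            keypoints.append((c * scale, r * scale, float(scale)))  # (x, y, sigma)
    return keypoints
```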

3.5 Feature Components


In feature-based computer vision tasks, such as object recognition or tracking, three main components are
involved:

1. Detection: Identifying points of interest (e.g., corners, blobs) in the image.


2. Description: Extracting feature descriptors that capture local image information around each point
of interest.
3. Matching: Establishing correspondences between feature descriptors in different images for tasks like
object matching or motion tracking.

3.6 Feature Tracking
Feature tracking involves following identified features over time or across frames in a video sequence. It
requires robust feature detection and matching techniques to ensure accurate tracking despite variations in
appearance due to motion or changes in viewpoint.
