Lesson 14-Stereo Vision

The document outlines the course CS-867: Computer Vision, focusing on stereo vision techniques for depth estimation using binocular images. It discusses concepts such as correspondence, epipolar geometry, and camera calibration, emphasizing the importance of finding point correspondences between images to recover 3D structures. Additionally, it covers practical applications like anaglyphs, stereograms, and the use of essential matrices in stereo vision.


Course: CS-867 Computer Vision
Course Instructor: Dr. Momina Moetesum
Email: [email protected]

Stereo Vision

• Stereo vision is the ability to infer the 3-D structure and distances of a scene from two or more images taken from different viewpoints.
• Humans use only two eyes/images.
• Correspondence: which parts of the left and right images are projections of the same object?
• Reconstruction: given correspondences in the left and right images, and possibly information on the stereo geometry, compute the 3D location and structure of the observed objects.

Stereo Vision

• Humans can fuse pairs of images to get a sensation of depth

Problem Formulation (Ideal Case)

• Given a calibrated binocular stereo pair, fuse it to produce a depth image

Anaglyphs

• Anaglyphs provide a stereoscopic 3D effect when viewed with 2-color glasses (each lens a chromatically opposite color, usually red and cyan).
• http://en.wikipedia.org/wiki/Anaglyph_image
• A free pair of red-blue stereo glasses can be ordered from Rainbow Symphony Inc.
• http://www.rainbowsymphony.com/freestuff.html

Stereograms

3D Movies

Stereo Vision for Depth Estimation

• Goal: recover 3D structure from two or more images.
• The key is to find correspondences (i.e., which point in the left image corresponds to which point in the right image).
• Once you know the correspondences, you can compute depth using triangulation.

Stereo Vision for Depth Estimation

• You have two cameras separated by a baseline b.
• The cameras are aligned so their image planes are parallel (rectified setup).
• If a point in space projects to (xL, y) in the left image and (xR, y) in the right image, then the disparity is d = xL − xR.
• The depth Z is given by Z = f·b / d, where f is the focal length.
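A minimal sketch of these two formulas (the function name and example values are ours, assuming f in pixels and b in metres):

```python
def depth_from_disparity(x_left, x_right, f, b):
    """Depth from a rectified stereo pair: d = xL - xR, Z = f*b/d
    (f: focal length in pixels, b: baseline in metres)."""
    d = x_left - x_right                      # disparity in pixels
    if d <= 0:
        raise ValueError("point must have positive disparity")
    return f * b / d                          # depth in metres

# e.g. f = 700 px, b = 0.12 m, xL = 410, xR = 375 -> d = 35 px, Z = 2.4 m
print(depth_from_disparity(410, 375, f=700, b=0.12))
```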

Correspondence Problem

• Hardest problem: finding, for each pixel in the left image, the matching pixel in the right image.
• Assumption: in rectified images, corresponding points lie on the same horizontal line, i.e., the epipolar line.

Basic stereo matching algorithm
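The slide shows the algorithm pictorially; below is a minimal sketch of the scanline search it describes, assuming rectified grayscale images as NumPy float arrays (the function name and defaults are ours):

```python
import numpy as np

def block_match_scanline(left, right, row, x, half=3, max_disp=64):
    """For pixel (row, x) in the left image, slide a (2*half+1)^2 window
    along the same scanline of the right image and return the disparity
    with the smallest sum of squared differences (SSD)."""
    patch = left[row - half:row + half + 1, x - half:x + half + 1]
    best_d, best_ssd = 0, np.inf
    for d in range(max_disp):
        xr = x - d                            # candidate column in the right image
        if xr - half < 0:
            break
        cand = right[row - half:row + half + 1, xr - half:xr + half + 1]
        ssd = np.sum((patch - cand) ** 2)     # window dissimilarity
        if ssd < best_ssd:
            best_ssd, best_d = ssd, d
    return best_d
```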

Epipolar Constraint

• When two cameras observe the same 3D point in space, each captured image contains the projection of that point onto that camera's image plane. Each projection is constrained to lie on a line in its image called an epipolar line.
• The epipolar constraint states that corresponding points in the two images must lie on their respective epipolar lines.
• This reduces the search space when trying to find correspondences between points in two images.

Epipolar Geometry

• Epipole: the center of projection of the right camera as perceived in the left image.
• Epipolar plane: the 3D plane defined by the two camera centers and the 3D point in question. The epipoles lie on the epipolar plane. Every scene point has its own epipolar plane.
• Epipolar line: a line in the left image that passes through the perceived position of the point and through the epipole.

Finding Epipolar Lines

• The essential and fundamental matrices are 3×3 matrices that "encode" the epipolar geometry of two views.
• Motivation: given a point in one image, multiplying by the essential/fundamental matrix tells us which epipolar line to search along in the second view.
• The Essential Matrix (E) is a fundamental concept in stereo vision and epipolar geometry. It encodes the relative rotation and translation between two calibrated cameras and relates corresponding normalized image points in the two views.
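A minimal sketch of how E is built and used, assuming R and t are the known relative rotation and translation between the calibrated cameras (function names are ours):

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[    0, -t[2],  t[1]],
                     [ t[2],     0, -t[0]],
                     [-t[1],  t[0],     0]], dtype=float)

def essential_matrix(R, t):
    """E = [t]_x R from the relative pose of two calibrated cameras."""
    return skew(t) @ R

# For a normalized homogeneous point x in the first view, E @ x gives the
# epipolar line to search in the second view; a correct correspondence x2
# satisfies x2 @ E @ x == 0 (the epipolar constraint).
```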
Epipolar Lines

• Simplest case: parallel images

Slide from L. Lazebnik.

Epipolar Geometry: Parallel Images

When this relationship holds: the two cameras are related by a pure translation along the baseline, i.e. R = I and t = (T, 0, 0).

Then, we can write the essential matrix using the skew-symmetric matrix of t:

E = [t]_x R = |  0   0   0 |
              |  0   0  -T |
              |  0   T   0 |

Where this should always be true (the epipolar constraint):

p_r^T E p_l = 0
Epipolar Geometry: Parallel Images

Substituting values into the epipolar constraint using the essential matrix for parallel images gives v_l = v_r: the y-coordinate of a corresponding point is always the same!

The image of a 3D point will always be on the same horizontal line.

Correspondence Search

Measuring similarity
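The measures on these slides are shown as figures; as an assumption, the standard window-similarity measures for stereo matching are SSD, SAD, and normalized cross-correlation, sketched here:

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower is more similar."""
    return np.sum((a - b) ** 2)

def sad(a, b):
    """Sum of absolute differences: lower is more similar."""
    return np.sum(np.abs(a - b))

def ncc(a, b):
    """Normalized cross-correlation: +1 is a perfect match; invariant to
    affine changes in brightness (gain and offset)."""
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
```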

Disparity Maps

The resulting map shows the disparity at each pixel.
Brighter values = closer objects (high disparity).
Darker values = farther objects (low disparity).
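A minimal sketch computing such a map with OpenCV's block matcher, assuming a rectified grayscale pair ("left.png"/"right.png" are placeholder filenames):

```python
import cv2

left  = cv2.imread("left.png",  cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disp = stereo.compute(left, right).astype("float32") / 16.0  # fixed-point -> pixels

# Brighter = closer (high disparity), darker = farther (low disparity).
vis = cv2.normalize(disp, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")
cv2.imwrite("disparity.png", vis)
```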

Two View Stereo Vision

Assumptions

• Ideal case: we assume that:
  • Cameras are calibrated
  • Images are rectified
• Camera calibration: extrinsic [R, T] and intrinsic parameters [f, (cx, cy)] are known.
• With calibrated cameras, we can convert pixel coordinates to normalized camera coordinates (Projection Matrix P).
• This allows use of the Essential Matrix E (instead of the more general Fundamental Matrix F), and simplifies 3D reconstruction.

Camera Geometry

• A 3D point in the world projects onto the image plane using similar triangles. This is a core concept in camera geometry.
• Linear Camera Model
  • Finding the projection matrix
• Perspective Projection
  • Finding relations between the image plane, camera coordinates, and scene world coordinates
• Camera Calibration
  • Estimating the values of the Projection Matrix P (intrinsic/extrinsic)
Linear Camera Model

• From 3D world coordinates to 2D image coordinates.

Conversion from the World Coordinate Frame (P_w) to the Camera Coordinate Frame (P_c):

P_c = R P_w + t

R: 3×3 rotation matrix (camera orientation)
t: 3×1 translation vector (camera position)

In homogeneous coordinates:

[P_c; 1] = [R t; 0 1] [P_w; 1]
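A minimal sketch of this rigid transform written as a single homogeneous 4×4 multiplication (the function name is ours):

```python
import numpy as np

def world_to_camera(R, t, Pw):
    """P_c = R @ P_w + t, expressed as one homogeneous 4x4 matrix product."""
    T = np.eye(4)
    T[:3, :3] = R                      # rotation block
    T[:3, 3] = t                       # translation column
    Pw_h = np.append(Pw, 1.0)          # homogeneous world point
    return (T @ Pw_h)[:3]              # back to a 3-vector
```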

Linear Camera Model

• Rotation

Linear Camera Model

Using Similar Triangles

Perspective Projection

• A way to map 3D world points to 2D image points as they would appear in a camera.
• This captures the foreshortening effect: objects further away appear smaller.
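A one-line sketch of perspective projection in camera coordinates (the similar-triangles derivation follows on the next slide; the function name and default f are ours):

```python
def project(X, Y, Z, f=1.0):
    """Perspective projection onto the image plane: x = f*X/Z, y = f*Y/Z.
    Larger Z (further away) gives smaller image coordinates: foreshortening."""
    return f * X / Z, f * Y / Z
```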

Perspective Projection

• A: optical center of the camera.
• B: a point on the image plane directly in front of the optical center (1 unit away, since AB = f = 1).
• C: vertical projection of the image point on the image plane.
• E: real-world point whose projection is being computed.
• D: point along the optical axis aligned with E.
• AD: depth of the 3D point E, denoted w.
• DE = v: vertical coordinate of the 3D point in camera coordinates.
• BC = y: the projected location on the image plane.
• Triangles ABC and ADE are similar, so y/f = v/w, i.e. y = v/w for f = 1.

Linear Camera Model

This is where DPI (dots per inch) or scaling factors come in:

u = m_x · x_i,  v = m_y · y_i

Here:
m_x: number of pixels per mm (in the horizontal direction)
m_y: number of pixels per mm (in the vertical direction)

So the final pixel location P_s = (u, v) gives the pixel indices on the image sensor grid.
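A sketch assembling these scaling factors into the intrinsic matrix K; as an assumption, the principal point (cx, cy) mentioned earlier under camera calibration is folded into the same matrix:

```python
import numpy as np

def intrinsics(f, mx, my, cx, cy):
    """Intrinsic matrix K with focal lengths in pixels (fx = mx*f, fy = my*f)
    and principal point (cx, cy); pixel coords are u = fx*X/Z + cx, etc."""
    return np.array([[mx * f, 0.0,    cx],
                     [0.0,    my * f, cy],
                     [0.0,    0.0,    1.0]])
```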

Linear Camera Model

• Translation (t) actually represents the position of the world origin in the
camera frame, not the camera’s position in the world frame directly.

Linear Camera Model

• From world to camera coordinates: rigid transformation (rotation & translation)

Camera Calibration

• Goal: Obtain the parameters of the perspective projection.

• Why?
• Correct distortions and inaccuracies
• Improve accuracy of measurements
• Extract metric information from 2D images
• Determine camera parameters: focal length, optical center
• Map the 3D world to 2D images accurately

Camera Calibration

• Goal: obtain the parameters of the perspective projection.
• There is a linear (up to homogeneous coordinates, "up to scale") relationship between image points and 3D points.
• How can we find the parameters of the projection matrix P?
  • If we had enough 2D-3D point correspondences, we could solve for all the parameters of P.
  • How should we obtain those?

Calibration Pattern

• Step 1:
  • Capture an image of an object of known geometry.
  • We need an assumption on the 3D structure: use a known pattern, a calibration pattern.
  • If we know the size (width and height, in mm) of the squares, we have a lot of 3D points to work with.

Calibration Pattern

• Step 2:
  • Find correspondences between 3D scene points and the 2D image: use a corner detector (Harris) or a keypoint detector (SIFT) and try to connect them in lines.
  • Presto: we have 2D & 3D matching points.
  • But: the 3D points lie on a plane, so they have no Z coordinate! (i.e., all 3D points have Z = 0 if the world frame = object frame).

Calibration Pattern

• 2D & 3D matching points give us, for each correspondence, two linear equations in the entries of the projection matrix P.

Solve for Projection Matrix

Need to solve for P:

A P = 0

where A [known] is built from the 2D-3D correspondences and P [unknown] holds the stacked entries of the projection matrix.
Solve for Projection Matrix

• Homogeneous coordinates: we can determine P only up to scale.
• If we scale down the object and move the camera closer, the image stays the same.
• Scaling the projection matrix implies simultaneously scaling the world and the camera, which does not change the image.

Solve for Projection Matrix

• Find P such that A P = 0, with ||P|| = 1.
• We can solve this with linear least squares (the eigen decomposition of A^T A, or the SVD of A), just like we did with homographies.
• Now we know P.
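A minimal DLT sketch of this step, assuming at least six 2D-3D correspondences (the function name is ours):

```python
import numpy as np

def solve_projection_matrix(pts3d, pts2d):
    """Build A from 2D-3D correspondences and take the right singular
    vector of A with the smallest singular value as the entries of P
    (solves A p = 0 with ||p|| = 1; needs >= 6 correspondences)."""
    rows = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        Xh = np.array([X, Y, Z, 1.0])
        rows.append([*Xh, 0, 0, 0, 0, *(-u * Xh)])   # u*(p3.X) = p1.X
        rows.append([0, 0, 0, 0, *Xh, *(-v * Xh)])   # v*(p3.X) = p2.X
    A = np.array(rows, dtype=float)                  # 2N x 12
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)                      # P, up to scale
```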

Solve for Projection Matrix

• But we know: P = K [R | t], the intrinsic matrix times the extrinsics.
• How do we break it apart?
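One standard way is an RQ decomposition of the left 3×3 block of P; a sketch using SciPy (the sign normalization is an implementation detail, and the function name is ours):

```python
import numpy as np
from scipy.linalg import rq

def decompose_projection(P):
    """Split P ~ K [R | t] via RQ decomposition of the left 3x3 block.
    Signs are flipped so K has a positive diagonal, and K is scaled so
    K[2, 2] == 1 (P is only defined up to scale)."""
    K, R = rq(P[:, :3])                     # P[:, :3] == K @ R
    S = np.diag(np.sign(np.diag(K)))        # diagonal of +/-1, S @ S == I
    K, R = K @ S, S @ R                     # absorb signs, K @ R unchanged
    t = np.linalg.solve(K, P[:, 3])         # since P[:, 3] == K @ t
    return K / K[2, 2], R, t
```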

In Reality!!!

Essential and Fundamental Matrix

Cross Product as Matrix Multiplication:

a × b = [a]_x b,  where  [a]_x = |  0   -a3   a2 |
                                 |  a3   0   -a1 |
                                 | -a2   a1   0  |

Essential Matrix

E = [T]_x R: depends only on the EXTRINSIC parameters (R & T)

Essential Matrix

• Longuet-Higgins equation: a film point can also be thought of as a viewing ray; they are equivalent.
  • (u, v): 2D film point
  • (u, v, f): 3D point on the film plane
  • k(u, v, f): viewing ray into the scene
  • k(X, Y, Z): ray through point P in the scene [hint: k = f/Z, and we have u = fX/Z, v = fY/Z].

Epipolar Lines

• An epipolar line is the intersection of the image plane and the epipolar plane.
• Every scene point has two corresponding epipolar lines, one on each of the two image planes.

Essential Matrix Summary

Essential vs Fundamental Matrices

• Both matrices relate corresponding points in two images.
• In the case of the Fundamental matrix, the points are in pixel coordinates, and it has more degrees of freedom (7 DOF), meaning it requires more point correspondences to compute, as more parameters must be estimated.
• In the case of the Essential matrix, the points are in "normalized image coordinates", which have their origin at the optical center of the image, with the x and y coordinates normalized by the focal lengths in pixels (fx and fy respectively) so that they are dimensionless.
• The Essential matrix has 5 DOF.
• The two matrices are related as E = K^T F K, where K is the intrinsic matrix of the camera (the special case where both images are taken with the same camera).
• In general, E = (K')^T F K, where K' and K are the intrinsic matrices of the two cameras respectively.
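A direct transcription of that last relation (K1 and K2 stand for K and K'; the function name is ours):

```python
import numpy as np

def essential_from_fundamental(F, K1, K2):
    """E = K2.T @ F @ K1 for two cameras with intrinsics K1 and K2
    (reduces to K.T @ F @ K when both images share one camera)."""
    return K2.T @ F @ K1
```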

Fundamental Matrix

Depends on both the INTRINSIC and EXTRINSIC Parameters

Example


Image Rectification

• The goal of image rectification is to transform images so that corresponding points in different images lie along the same scanlines, simplifying tasks such as stereo matching and 3D reconstruction.
• Rectification involves applying a transformation to the original image(s) such that straight lines in the scene appear straight in the rectified image(s).
• This transformation is typically computed from the camera calibration parameters and the epipolar geometry.

Image Rectification

We can rewarp the images through rotation, mapping both image planes to a common plane parallel to the baseline; this is called rectification.
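A sketch of this rewarping with OpenCV, assuming the calibration outputs K1, d1, K2, d2 (intrinsics and distortion), R, T (extrinsics), and the image size (w, h) are already available (variable names are ours):

```python
import cv2

R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, (w, h), R, T)

map1x, map1y = cv2.initUndistortRectifyMap(K1, d1, R1, P1, (w, h), cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, d2, R2, P2, (w, h), cv2.CV_32FC1)

# Warp both images onto a common plane parallel to the baseline.
left_rect  = cv2.remap(left,  map1x, map1y, cv2.INTER_LINEAR)
right_rect = cv2.remap(right, map2x, map2y, cv2.INTER_LINEAR)
```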

Image Rectification

What if the images are not in the required setup?

Stereo Rectification Algorithm

• Compute:
  • the Essential matrix, OR
  • the Fundamental matrix
• Estimate the rectification transformation
• Apply the transformation
  • Image warping
  • Interpolation

Correspondence Problem

• Finding pairs of matched points in each image that are projections of the same scene point.
• Triangulation (depth estimation) depends on the solution of the correspondence problem.
• Ambiguous correspondences between points in the two images may lead to several different consistent interpretations of the scene.

Issues with Correlation

Issues with Window-based Techniques

Correspondence Search (using Block Matching)

Issues with Block Matching


Better Techniques

Comparison of Techniques

Credits

Some of the content of this lecture is taken from the following books and online sources, with the intention of clarifying various educational concepts. No copyright infringement is intended.
