FBBMS602 Computer Vision
Spring 2023
Gülşah Tümüklü Özyer
Why is it Difficult?
What are the Challenges?
Challenge 1: viewpoint variation
Michelangelo 1475-1564
Challenge 2: illumination
slide credit: S. Ullman
Challenge 3: occlusion
Magritte, 1957
Challenge 4: scale
Challenge 5: deformation
Xu, Beihong 1943
Challenge 6: background clutter
Klimt, 1913
Challenge 7: intra-class variation
The Three Stages of Computer Vision
• low-level: image → image
• mid-level: image → features
• high-level: features → analysis
Low-Level
sharpening
blurring
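A minimal sketch of how sharpening and blurring can both be written as a convolution of the image with a small mask. The slide does not name specific kernels; the 3x3 box-blur and sharpening kernels below, the helper name convolve2d, and the test image are assumptions chosen for illustration.

```python
import numpy as np

def convolve2d(image, kernel):
    """Convolve a 2D gray image with a small kernel (same-size output, replicated border)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)), mode="edge")
    flipped = kernel[::-1, ::-1]  # true convolution flips the kernel
    out = np.zeros(image.shape, dtype=float)
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for dy in range(kh):
        for dx in range(kw):
            out += flipped[dy, dx] * padded[dy:dy + image.shape[0],
                                            dx:dx + image.shape[1]]
    return out

# Assumed example kernels (not taken from the slides).
box_blur = np.full((3, 3), 1.0 / 9.0)
sharpen = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]], dtype=float)

# A single bright pixel makes the effect of each kernel easy to see.
image = np.zeros((5, 5))
image[2, 2] = 9.0

blurred = convolve2d(image, box_blur)    # energy spread over a 3x3 patch
sharpened = convolve2d(image, sharpen)   # center boosted, 4-neighbors negative
```

Both outputs are low-level results in the sense of the previous slide: an image goes in and an image comes out.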
Low-Level
Canny: original image → edge image
Mid-Level
ORT: edge image → data structure (circular arcs and line segments)
Mid-Level
K-means clustering (followed by connected component analysis):
original color image → regions of homogeneous color → data structure
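A minimal sketch of the K-means step on pixel colors, assuming plain Euclidean distance in RGB and a simplistic deterministic initialization (evenly spaced pixels). The function name kmeans_colors and the tiny test image are hypothetical, and the connected component analysis that follows in the slide is not included here.

```python
import numpy as np

def kmeans_colors(pixels, k, iterations=10):
    """Cluster an (N, 3) array of colors into k groups; returns (labels, centers)."""
    # Assumed initialization: k pixels evenly spaced through the pixel list.
    start = np.linspace(0, len(pixels) - 1, k).astype(int)
    centers = pixels[start].astype(float)
    for _ in range(iterations):
        # Distance from every pixel to every center, shape (N, k).
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean color of its members.
        for c in range(k):
            members = pixels[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    return labels, centers

# Hypothetical 2x4 color image: reddish left half, bluish right half.
image = np.array([[[200, 10, 10], [210, 20, 15], [10, 10, 200], [20, 15, 210]],
                  [[205, 12, 8], [198, 9, 14], [12, 14, 205], [8, 10, 198]]],
                 dtype=float)

labels, centers = kmeans_colors(image.reshape(-1, 3), k=2)
label_image = labels.reshape(image.shape[:2])  # one cluster id per pixel
```

Pixels with the same label form color-homogeneous groups; splitting each group into spatially connected regions is the job of the connected component step.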
Low- to High-Level
low-level: edge image
mid-level: consistent line clusters
high-level: building recognition
Recognition
Scale / orientation range to search over
Speed
Context
Course content
Image representation
Matrices, functions
Image file formats
Binary Image Analysis
Pixel and neighborhood
Masks and convolution
Counting and labeling
Morphological operations
Thresholding
Object Recognition concepts
Representation
Classification
Measures
Gray-level Image Analysis
Gray level mapping
Noise removal
Smoothing
Color and shading
Color spaces
Shades
Texture
Texels, texture description
Texture measure
Segmentation
Clustering
Region Growing
Content Based Image retrieval
Imaging and Image Representation
Ch. 2, Shapiro et al.
Classical Imaging Process
Light reaches surfaces
in 3D
Surfaces reflect
Sensor element
receives light energy
Intensity counts
Angles count
Material counts
What are radiance and irradiance?
Radiometry and Computer Vision*
• Radiometry is a branch of physics that deals with the measurement of the flow and transfer of radiant energy.
• Radiance is the power of light emitted from a unit surface area into a unit solid angle; the corresponding photometric term is brightness.
• Irradiance is the amount of energy that an image-capturing device receives per unit of effective sensitive area of the camera. Quantizing it gives image gray tones.
* From Sonka, Hlavac, and Boyle, Image Processing, Analysis, and Machine Vision, ITP, 1999.
Sensors:
Image acquisition Devices
CCD (Charge-Coupled Device)
X-Ray Devices
Microwave Devices
UV Devices
Thermal Cameras
IR Devices
3-D scanners
CCD-type camera:
Commonly used in industrial applications
Array of small fixed elements
Each element converts light energy to electric charge
1x1 cm
Can add refracting elements to get color in 2x2 neighborhoods
8-bit intensity common
Computer Vision
Algorithms
The main concern of CV is to develop algorithms
LIDAR also senses surfaces
Single sensing element scans scene
Laser light reflected off surface and returned
Phase shift codes distance
Brightness change codes albedo (surface reflectance)
Stockman MSU/CSE Fall 2008
2.5D face image from Minolta Vivid 910 scanner
A rotating mirror scans a laser stripe across the object. 320x240 rangels obtained in about 2 seconds.
3D scanning technology
3D image of voxels obtained
Usually computationally expensive reconstruction of 3D from many 2D scans (CAT: computer-aided tomography)
Magnetic Resonance Imaging
Sense density of certain chemistry
S slices x R rows x C columns
Volume element (voxel) about 2 mm per side
At left is a shaded 2D image created by “volume rendering” a 3D volume: darkness codes depth
Single slice through human head
MRIs are computed structures, computed from many views.
At left is MRA (angiograph), which shows blood flow.
CAT scans are computed in much the same manner from X-ray transmission data.
Problems in Image Acquisition
Human eye as a spherical camera
75-150 million rods sense intensity
6-7 million cones sense color
Fovea is a tightly packed area with more cones
Periphery has more rods
Focal length is about 20 mm
Pupil/iris controls light entry
• Eye scans, or saccades, to image details on the fovea
• 100M sensing cells funnel to 1M optic nerve connections to the brain
RODS AND CONES
Cones
Image Formation
Problems in HVS: Mach Band Effect
Contrast
Illusions
Images: 2D projections of 3D
The 3D world has color, texture, surfaces,
volumes, light sources, temperature, reflectance,
…
A 2D image is a projection of a scene from a
specific viewpoint.
Digital Images form arrays
Digitizing: Sampling
Quantization
Digital Image: Sampled and quantized
Sampling at different resolution
Sampling
Quantization
What are the appropriate sampling and quantization rates?
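The two digitizing steps can be sketched on an array that is already digital: sampling keeps every n-th pixel, and quantization reduces the number of gray tones. This is only an illustration under stated assumptions: the function names sample and quantize, the 8-bit value range, and the ramp test image are hypothetical, and real sensors sample a continuous signal rather than an existing array.

```python
import numpy as np

def sample(image, step):
    """Spatial sampling: keep every `step`-th row and column."""
    return image[::step, ::step]

def quantize(image, levels, max_value=255):
    """Map values in [0, max_value] onto `levels` evenly spaced gray tones."""
    bin_width = (max_value + 1) / levels
    bins = np.floor(image / bin_width)  # bin index in 0..levels-1
    # Represent each bin by its center value.
    return (bins * bin_width + bin_width / 2).astype(np.uint8)

# Hypothetical 8x8 test image: a horizontal ramp 0, 32, ..., 224.
ramp = np.tile(np.arange(0, 256, 32, dtype=np.uint8), (8, 1))

coarse = sample(ramp, 2)        # 4x4 image, half the resolution per axis
two_level = quantize(ramp, 2)   # only two gray tones remain
```

Lower sampling rates lose spatial detail, while fewer quantization levels lose tonal detail; the appropriate rates depend on what the later analysis needs to distinguish.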
Resolution
• resolution: precision of the sensor
• nominal resolution: size of a single pixel in scene coordinates (e.g., meters, mm)
• common use of resolution: num_rows x num_cols (e.g., 515 x 480)
• field of view (FOV): size of the scene a sensor can sense
Images as Functions
• A gray-tone image is a function:
g(x,y) = val or f(row, col) = val
• A color image is just three functions or a vector-valued function:
f(row,col) = (r(row,col), g(row,col), b(row,col))
• Multi-spectral image:
f(row,col) = (f1(row,col), f2(row,col), …, fn(row,col))
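The function view above maps directly onto array indexing. The sketch below assumes NumPy arrays as the storage and uses made-up pixel values; the names f and f_color simply mirror the notation on the slide.

```python
import numpy as np

# Hypothetical 2x3 gray-tone image.
gray = np.array([[10, 20, 30],
                 [40, 50, 60]], dtype=np.uint8)

def f(row, col):
    """Gray-tone image as a function: f(row, col) = val."""
    return int(gray[row, col])

# A color image stacks three such functions along a last axis: (rows, cols, 3).
color = np.stack([gray, gray // 2, 255 - gray], axis=-1)

def f_color(row, col):
    """Color image as a vector-valued function: f(row, col) = (r, g, b)."""
    r, g, b = color[row, col]
    return int(r), int(g), int(b)

value = f(1, 2)
rgb = f_color(1, 2)
```

A multi-spectral image follows the same pattern with n bands in the last axis instead of three.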
Gray-tone Image as Function
Image vs Matrix
There are many different file formats.
Digital Image Terminology:
0 0  0  0  1  0  0
0 0  1  1  1  0  0
0 1 95 96 94 93 92
0 0 92 93 93 92 92
0 0 93 93 94 92 93
0 1 92 93 93 93 93
0 0 94 95 95 96 95
Figure labels: pixel (with value 94); its 3x3 neighborhood; region of medium intensity; resolution (7x7)
• binary image
• gray-scale (or gray-tone) image
• color image
• multi-spectral image
• range image
• labeled image
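The terminology can be tried out on the 7x7 array from the slide. The extracted text does not show which pixel of value 94 the figure marks, so the position (row 4, col 4, zero-based) is an assumption, as are the helper name neighborhood and the threshold of 50 used to pick out the medium-intensity region.

```python
import numpy as np

image = np.array([
    [0, 0,  0,  0,  1,  0,  0],
    [0, 0,  1,  1,  1,  0,  0],
    [0, 1, 95, 96, 94, 93, 92],
    [0, 0, 92, 93, 93, 92, 92],
    [0, 0, 93, 93, 94, 92, 93],
    [0, 1, 92, 93, 93, 93, 93],
    [0, 0, 94, 95, 95, 96, 95],
])

def neighborhood(img, row, col, size=3):
    """Return the size x size block centered on (row, col), clipped at the border."""
    half = size // 2
    return img[max(row - half, 0):row + half + 1,
               max(col - half, 0):col + half + 1]

row, col = 4, 4                  # assumed position of the marked pixel
pixel_value = int(image[row, col])
patch = neighborhood(image, row, col)

# Assumed threshold separating the medium-intensity region from near-zero pixels.
medium_region = image > 50
```

Here image.shape gives the resolution in the num_rows x num_cols sense, and medium_region is itself a binary image.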
Image File Formats
Portable Gray Map (PGM) is an older format
GIF was an early commercial format
JPEG (JPG) is a modern format
MPEG for motion
Many others exist: header plus data
Do they handle color?
Do they provide for compression?
Are there good packages that use them
or at least convert between them?
Compression:
Reduce the redundancy
1. Lossy
2. Lossless
Run Coding
Row1 0001001000000
Row2 0001111000000
Row3 0001001000000
Code 1 (Row1, run lengths): 3(0)1(1)2(0)1(1)6(0)
or
Code 2 (Row1, start and end columns of runs of 1s): (4,4)(7,7)
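Both codes can be reproduced with a few lines of Python. This is a sketch: the function names run_lengths and one_runs are hypothetical, rows are taken as strings of '0' and '1' as on the slide, and column positions are 1-based to match Code 2.

```python
from itertools import groupby

def run_lengths(row):
    """Code 1 style: list of (length, value) for each run of equal symbols."""
    return [(len(list(group)), value) for value, group in groupby(row)]

def one_runs(row):
    """Code 2 style: (start, end) columns, 1-based and inclusive, of each run of 1s."""
    runs, start = [], None
    for i, value in enumerate(row, start=1):
        if value == "1" and start is None:
            start = i                      # a run of 1s begins here
        elif value != "1" and start is not None:
            runs.append((start, i - 1))    # the run ended on the previous column
            start = None
    if start is not None:                  # run reaches the end of the row
        runs.append((start, len(row)))
    return runs

row1 = "0001001000000"
row2 = "0001111000000"

code1_row1 = run_lengths(row1)
code2_row1 = one_runs(row1)
code2_row2 = one_runs(row2)
```

Code 2 is shorter because the background is implied; for Row2 the single run of four 1s becomes the one pair (4,7).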
PGM image with ASCII info.
P2 means ASCII gray
Comments
W=16; H=8
192 is max intensity
Can be made with editor
Large images are usually not stored as ASCII
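Because a P2 file is plain text, it can be read with ordinary string handling. The sketch below assumes the usual layout of magic number, width and height, maximum value, then pixel values, with '#' starting a comment. The function name parse_ascii_pgm and the 4x2 sample are hypothetical; the 16x8 image from the slide is not reproduced because its pixel data is not in the text.

```python
def parse_ascii_pgm(text):
    """Parse an ASCII (P2) PGM string into (width, height, max_value, rows)."""
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]   # drop comments
        tokens.extend(line.split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII PGM (P2) file")
    width, height, max_value = (int(t) for t in tokens[1:4])
    values = [int(t) for t in tokens[4:]]
    if len(values) != width * height:
        raise ValueError("pixel count does not match the header")
    rows = [values[r * width:(r + 1) * width] for r in range(height)]
    return width, height, max_value, rows

# Tiny hypothetical file that could be typed in a text editor.
sample_pgm = """P2
# a 4x2 gray ramp
4 2
192
0 64 128 192
192 128 64 0
"""

width, height, max_value, rows = parse_ascii_pgm(sample_pgm)
```

The same text layout is what makes large ASCII images wasteful: each pixel costs several characters instead of one byte.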
PBM/PGM/PPM Codes
• P1: ASCII binary (PBM)
• P2: ASCII grayscale (PGM)
• P3: ASCII color (PPM)
• P4: byte binary (PBM)
• P5: byte grayscale (PGM)
• P6: byte color (PPM)
JPG: current popular format
Public standard
Allows for image compression; ratios of 10:1 or 30:1 are often easily possible
8x8 intensity regions are fit with a basis of cosines
Error in cosine fit coded as well
Parameters then compressed with Huffman coding
Common for most digital cameras
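The "basis of cosines" step can be illustrated with an orthonormal 2D DCT-II on a single 8x8 block. This is a sketch of the transform only, not of the JPEG codec: quantization, error coding, and Huffman coding are omitted, and the helper name dct_matrix and the ramp test block are assumptions.

```python
import numpy as np

N = 8

def dct_matrix(n=N):
    """Orthonormal DCT-II matrix: row k is the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    x = np.arange(n).reshape(1, -1)   # sample index
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)        # constant (DC) basis vector
    return C

C = dct_matrix()

# Hypothetical smooth block: intensity grows along both axes.
block = np.add.outer(np.arange(N), np.arange(N)) * 8.0

coeffs = C @ block @ C.T       # forward 2D DCT: cosine-fit coefficients
restored = C.T @ coeffs @ C    # inverse 2D DCT
```

For a smooth block like this one, almost all of the energy lands in a few low-frequency coefficients, which is why the coefficients compress well.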
From 3D Scenes to 2D Images
• Object
• World
• Camera
• Real Image
• Pixel Image
Binary Image Analysis
Binary image analysis
• consists of a set of image analysis operations
that are used to produce or process binary
images, usually images of 0’s and 1’s.
0 represents the background
1 represents the foreground
00010010001000
00011110001000
00010010001000
Binary Image Analysis
is used in a number of
practical applications, e.g.
• part inspection
• riveting
• fish counting
• document processing
What kinds of operations?
Separate objects from background
and from one another
Aggregate pixels for each object
Compute features for each object
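The three operations can be sketched on the small binary image shown earlier in this section. The labeling below uses a breadth-first flood fill with 4-connectivity, which is one simple choice and not necessarily the algorithm covered later in the course; the function names and the choice of area and centroid as features are assumptions.

```python
from collections import deque

rows = ["00010010001000",
        "00011110001000",
        "00010010001000"]
image = [[int(ch) for ch in row] for row in rows]   # 1 = foreground, 0 = background

def label_components(img):
    """Give each 4-connected foreground region its own label 1, 2, ..."""
    height, width = len(img), len(img[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if img[r][c] == 1 and labels[r][c] == 0:
                count += 1                          # new object found
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                        # aggregate its pixels
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

def region_features(labels, count):
    """Compute area and centroid (row, col) for each labeled object."""
    stats = {}
    for label in range(1, count + 1):
        pts = [(r, c) for r, row in enumerate(labels)
               for c, v in enumerate(row) if v == label]
        area = len(pts)
        stats[label] = {"area": area,
                        "centroid": (sum(r for r, _ in pts) / area,
                                     sum(c for _, c in pts) / area)}
    return stats

labels, count = label_components(image)
stats = region_features(labels, count)
```

On this image the H-shaped object and the vertical bar come out as two separate objects, each with its own feature record.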