Lecture25 Spring2018
recognition
Tues April 23
Kristen Grauman
UT Austin
Last time
• Supervised classification continued
• Nearest neighbors
• Support vector machines
• HoG pedestrians example
• Kernels
• Multi-class from binary classifiers
Recall: Examples of kernel functions
Linear: $K(x_i, x_j) = x_i^T x_j$
Gaussian RBF: $K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$
Histogram intersection: $K(x_i, x_j) = \sum_k \min(x_i(k), x_j(k))$
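The three kernels above can be written out directly. The following is a minimal sketch in plain Python, assuming feature vectors are equal-length lists of numbers; the function names and the default `sigma=1.0` are illustrative choices, not taken from the lecture.

```python
import math

def linear_kernel(xi, xj):
    """K(xi, xj) = xi^T xj."""
    return sum(a * b for a, b in zip(xi, xj))

def gaussian_rbf_kernel(xi, xj, sigma=1.0):
    """K(xi, xj) = exp(-||xi - xj||^2 / (2 sigma^2)); sigma is an assumed default."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def histogram_intersection_kernel(xi, xj):
    """K(xi, xj) = sum_k min(xi(k), xj(k)), for non-negative histogram bins."""
    return sum(min(a, b) for a, b in zip(xi, xj))

# Two toy 4-bin histograms (made-up values for illustration).
h1 = [3, 0, 2, 1]
h2 = [1, 2, 2, 0]

print(linear_kernel(h1, h2))                  # 7
print(histogram_intersection_kernel(h1, h2))  # 3
print(gaussian_rbf_kernel(h1, h1))            # 1.0 (identical inputs)
```

The histogram intersection value of 3 here is the number of bin-level matches: one in the first bin and two in the third.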
[Figure: point sets in descriptor space (m = number of points)]
Histogram intersection counts the number of possible matches at a given partitioning.
Slide credit: Kristen Grauman
Pyramid match
optimal partial matching
The Pyramid Match Kernel: Efficient Learning with Sets of Features. K. Grauman and T. Darrell. Journal of Machine Learning Research (JMLR), 8 (Apr): 725--760, 2007.
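To make the idea of counting matches at successively coarser partitionings concrete, here is a simplified sketch of a pyramid match score. It rests on several assumptions not stated on the slide: features are 1-D non-negative integers, bin widths double at each level, and new matches found at level i are weighted by 1/2^i. It omits refinements from the paper such as normalization, so treat it as an illustration rather than the published algorithm.

```python
from collections import Counter

def histogram(points, bin_width):
    """Count how many points fall in each bin of the given width."""
    return Counter(p // bin_width for p in points)

def intersection(h1, h2):
    """Histogram intersection: number of matches at this partitioning."""
    return sum(min(count, h2[b]) for b, count in h1.items())

def pyramid_match(xs, ys, num_levels):
    """Weighted sum of NEW matches found at each coarser level (assumed weight 1/2^level)."""
    score = 0.0
    previous_matches = 0
    for level in range(num_levels):
        width = 2 ** level
        matches = intersection(histogram(xs, width), histogram(ys, width))
        # Only matches that did not already exist at a finer level are counted.
        score += (matches - previous_matches) / width
        previous_matches = matches
    return score

# Toy point sets (made-up values).
print(pyramid_match([1, 5, 9], [1, 4, 14], num_levels=4))  # 1.625
```

In this toy run one pair matches exactly (weight 1), one pair first matches at bin width 2 (weight 1/2), and the last pair only matches at bin width 8 (weight 1/8).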
BoW Issue:
No spatial layout preserved!
Confusion table
Today
• (Deep) Neural networks
• Convolutional neural networks
Traditional Image Categorization: Training phase
[Figure: training pipeline. Training images -> image features -> training (with training labels) -> trained classifier]
[Figure: testing pipeline. Test image -> image features -> trained classifier -> prediction, e.g. "Outdoor"]
Slide credit: Jia-Bin Huang
Features have been key
SIFT [Lowe IJCV 04] HOG [Dalal and Triggs CVPR 05]
[Figure: image/video pixels -> Layer 1 -> Layer 2 -> Layer 3 -> simple classifier -> image/video labels]
[Figure: learned feature hierarchy. Input data: pixels -> 1st layer: “Edges” -> 2nd layer: “Object parts” -> 3rd layer: “Objects”]
Lee et al., ICML 2009; CACM 2011
Slide: R. Fergus
Biological neuron and Perceptrons
David Hubel's Eye, Brain, and Vision
Slide credit: Jia-Bin Huang
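As a concrete illustration of a perceptron, the sketch below implements a single unit with a step activation and the classic perceptron learning rule, trained on the logical AND function. The function names, the learning rate, the epoch count, and the choice of AND as the toy task are all assumptions made for this example.

```python
def predict(weights, bias, x):
    """Fire (1) if the weighted sum of inputs plus bias is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0

def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron rule: nudge weights toward inputs that were misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Logical AND is linearly separable, so a single perceptron can learn it.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_and = [0, 0, 0, 1]
w, b = train_perceptron(X, y_and)
print([predict(w, b, x) for x in X])  # [0, 0, 0, 1]
```

A single unit like this can only represent linear decision boundaries, which is one motivation for stacking units into multi-layer networks.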
Hubel/Wiesel Architecture and Multi-layer Neural Network
Procedure:
• Gradient descent / hill climbing
Practical considerations:
• Can be seen as learning the features
• Large number of neurons
• Danger of overfitting
• Hill-climbing procedure can get stuck in bad local optima
Approximation by Superpositions of a Sigmoidal Function, 1989
Slide credit: Pieter Abbeel and Dan Klein
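The gradient descent procedure for a multi-layer network can be sketched as follows. This is a minimal illustration, assuming a network with one hidden layer of sigmoid units, a squared-error loss, full-batch updates, and XOR as the toy task; the layer size, learning rate, step count, and random seed are arbitrary choices for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Return hidden activations and the scalar output for input x."""
    w1, b1, w2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output

def mean_squared_error(params, data):
    return sum((forward(params, x)[1] - y) ** 2 for x, y in data) / len(data)

def gradient_step(params, data, lr):
    """One full-batch gradient descent step, with gradients from backpropagation."""
    w1, b1, w2, b2 = params
    n_hid, n_in = len(w1), len(w1[0])
    g_w1 = [[0.0] * n_in for _ in range(n_hid)]
    g_b1 = [0.0] * n_hid
    g_w2 = [0.0] * n_hid
    g_b2 = 0.0
    for x, y in data:
        hidden, out = forward(params, x)
        # Derivative of (out - y)^2 with respect to the output pre-activation.
        d_out = 2.0 * (out - y) * out * (1.0 - out)
        g_b2 += d_out
        for j in range(n_hid):
            g_w2[j] += d_out * hidden[j]
            # Chain rule back through hidden unit j.
            d_hid = d_out * w2[j] * hidden[j] * (1.0 - hidden[j])
            g_b1[j] += d_hid
            for i in range(n_in):
                g_w1[j][i] += d_hid * x[i]
    scale = lr / len(data)
    new_w1 = [[w - scale * g for w, g in zip(row, g_row)]
              for row, g_row in zip(w1, g_w1)]
    new_b1 = [b - scale * g for b, g in zip(b1, g_b1)]
    new_w2 = [w - scale * g for w, g in zip(w2, g_w2)]
    return new_w1, new_b1, new_w2, b2 - scale * g_b2

random.seed(0)
n_hidden = 3
params = (
    [[random.uniform(-1, 1) for _ in range(2)] for _ in range(n_hidden)],
    [0.0] * n_hidden,
    [random.uniform(-1, 1) for _ in range(n_hidden)],
    0.0,
)
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

initial_loss = mean_squared_error(params, xor_data)
for _ in range(2000):
    params = gradient_step(params, xor_data, lr=0.5)
final_loss = mean_squared_error(params, xor_data)
print(initial_loss, final_loss)
```

Depending on the initialization, a run like this may fit XOR well or stall on a plateau, which is the local-optimum risk noted in the practical considerations.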
Significant recent impact on the field
[Figure: a convolutional network as a stack of stages, each applying Convolution (learned) -> Non-linearity -> Spatial pooling (e.g. max pooling) -> Normalization]
[Figure: two pipelines side by side. Image -> Pooling -> Classifier -> Label, and Image -> stacked Convolution/pool layers]
Jia-Bin Huang and Derek Hoiem, UIUC
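One stage of the stack (convolution, non-linearity, spatial pooling, normalization) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: a single-channel image stored as a list of lists, a hand-set kernel standing in for a learned one, ReLU as the non-linearity, 2x2 max pooling, and a simple global L2 normalization in place of whatever normalization scheme the slides depict.

```python
import math

def convolve2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation, the operation usually called convolution in CNNs."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Non-linearity: clamp negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

def l2_normalize(fmap, eps=1e-8):
    """Scale the whole feature map to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for row in fmap for v in row)) + eps
    return [[v / norm for v in row] for row in fmap]

def conv_stage(image, kernel):
    return l2_normalize(max_pool(relu(convolve2d_valid(image, kernel))))

# Toy 5x5 image with a vertical edge, and a vertical-edge kernel (both made up).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]
features = conv_stage(image, kernel)
print(features)  # 2x2 map; the nonzero responses sit in the column containing the edge
```

In a real network the kernel weights would be learned by gradient descent, and each stage would produce many feature maps rather than one.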
SIFT Descriptor
Lowe [IJCV 2004]
[Figure: Image pixels -> Apply oriented filters -> Spatial pool (sum) -> Normalize to unit length -> Feature vector]
slide credit: R. Fergus
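The same filter, pool, normalize pattern can be sketched for a hand-designed descriptor. The code below is a heavily simplified stand-in for SIFT, not Lowe's algorithm: it assumes a single spatial cell covering the whole patch, central-difference gradients as the oriented filters, and 8 orientation bins, and it omits the grid of cells, Gaussian weighting, and clipping used by the real descriptor.

```python
import math

def orientation_histogram(patch, num_bins=8):
    """Sum gradient magnitude into orientation bins over the patch, then L2-normalize."""
    hist = [0.0] * num_bins
    rows, cols = len(patch), len(patch[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Oriented filtering: central-difference gradients.
            gx = patch[r][c + 1] - patch[r][c - 1]
            gy = patch[r + 1][c] - patch[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            bin_index = int(angle / (2 * math.pi) * num_bins) % num_bins
            # Spatial pooling by summation.
            hist[bin_index] += magnitude
    # Normalize to unit length.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Toy 5x5 patch whose intensity increases left to right (made-up values).
patch = [[c for c in range(5)] for _ in range(5)]
descriptor = orientation_histogram(patch)
print(descriptor)  # all gradient energy falls in the first orientation bin
```

Seen this way, the descriptor has the same structure as one stage of a convolutional network, except that its filters are fixed by hand rather than learned.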
Spatial Pyramid Matching
Lazebnik, Schmid, Ponce [CVPR 2006]
[Figure: SIFT features -> Filter with visual words -> Max -> Multi-scale spatial pool (sum) -> Classifier]
slide credit: R. Fergus
Visualizing what was learned
• What do the learned filters look like?
AlexNet
http://image-net.org/challenges/talks/2016/ILSVRC2016_10_09_clsloc.pdf
Industry Deployment
• Used at Facebook, Google, Microsoft
• Image recognition, speech recognition, ...
• Fast at test time