Deep learning for visual recognition
Tues April 23
Kristen Grauman
UT Austin
Last time
• Supervised classification continued
• Nearest neighbors
• Support vector machines
• HOG pedestrians example
• Kernels
• Multi-class from binary classifiers
Recall: Examples of kernel functions
• Linear: K(x_i, x_j) = x_i^T x_j

• Gaussian RBF: K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²))

• Histogram intersection: K(x_i, x_j) = Σ_k min(x_i(k), x_j(k))

• Kernels go beyond vector space data


• Kernels also exist for “structured” input spaces like
sets, graphs, trees…
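For concreteness, the vector-space kernels recalled above can be written in a few lines of NumPy. This is a minimal sketch with my own function names; the histogram-intersection version assumes its inputs are non-negative histograms.

```python
import numpy as np

def linear_kernel(xi, xj):
    # K(xi, xj) = xi^T xj
    return np.dot(xi, xj)

def gaussian_rbf_kernel(xi, xj, sigma=1.0):
    # K(xi, xj) = exp(-||xi - xj||^2 / (2 sigma^2))
    return np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma ** 2))

def histogram_intersection_kernel(xi, xj):
    # K(xi, xj) = sum_k min(xi(k), xj(k)); xi, xj are (non-negative) histograms
    return np.sum(np.minimum(xi, xj))
```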
Discriminative classification with sets of features?
• Each instance is unordered set of vectors
• Varying number of vectors per instance

Slide credit: Kristen Grauman


Partially matching sets of features

Optimal match: O(m³)

Greedy match: O(m² log m)
Pyramid match: O(m)

(m = number of points)

We introduce an approximate matching kernel that


makes it practical to compare large sets of features
based on their partial correspondences.

[Previous work: Indyk & Thaper, Bartal, Charikar, Agarwal & Varadarajan, …]
Slide credit: Kristen Grauman
Pyramid match: main idea

Feature space partitions serve to “match” the local descriptors within successively wider regions.

(Figure: partitions of the descriptor space)

Slide credit: Kristen Grauman


Pyramid match: main idea

Histogram intersection
counts number of possible
matches at a given
partitioning.
Slide credit: Kristen Grauman
Pyramid match

K(X, Y) = Σ_i w_i N_i, where N_i measures the number of newly matched pairs at level i and w_i reflects the difficulty of a match at level i

• For similarity, weights inversely proportional to bin size (or may be learned)
• Normalize these kernel values to avoid favoring large sets

[Grauman & Darrell, ICCV 2005] Slide credit: Kristen Grauman


Pyramid match
Optimal match: O(m³)
Pyramid match: O(mL)

optimal partial
matching
The Pyramid Match Kernel: Efficient
Learning with Sets of Features. K.
Grauman and T. Darrell. Journal of
Machine Learning Research (JMLR), 8
(Apr): 725--760, 2007.
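As a rough sketch of the mechanism (not the authors' implementation), the following computes a pyramid match between two 1-D point sets: bins double in width at each level, newly matched pairs are counted by the change in histogram intersection, and weights are inversely proportional to bin size. The 1-D simplification and all names are mine.

```python
import numpy as np

def pyramid_match(X, Y, num_levels=5, diameter=32.0):
    """Pyramid-match-style similarity between two 1-D point sets (values in [0, diameter))."""
    score, prev_intersection = 0.0, 0.0
    for i in range(num_levels):
        bin_width = 2.0 ** i                                 # bins widen as the level gets coarser
        edges = np.arange(0.0, diameter + bin_width, bin_width)
        hx, _ = np.histogram(X, bins=edges)
        hy, _ = np.histogram(Y, bins=edges)
        intersection = float(np.minimum(hx, hy).sum())       # possible matches at this level
        new_matches = intersection - prev_intersection       # pairs matched only now
        score += new_matches / (2.0 ** i)                    # weight ~ 1 / bin size
        prev_intersection = intersection
    return score

def normalized_pyramid_match(X, Y, **kw):
    # Normalize by the self-similarities so large sets are not automatically favored.
    return pyramid_match(X, Y, **kw) / np.sqrt(pyramid_match(X, X, **kw) * pyramid_match(Y, Y, **kw))
```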
BoW Issue:
No spatial layout preserved!

Too much? Too little?

Slide credit: Kristen Grauman


Spatial pyramid match
• Make a pyramid of bag-of-words histograms.
• Provides some loose (global) spatial layout
information

[Lazebnik, Schmid & Ponce, CVPR 2006]


Spatial pyramid match
• Make a pyramid of bag-of-words histograms.
• Provides some loose (global) spatial layout
information

Sum over PMKs computed in image coordinate space, one per word.

[Lazebnik, Schmid & Ponce, CVPR 2006]
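One common way to realize the pyramid of bag-of-words histograms is sketched below, assuming each local feature already comes with a pixel location and a visual-word index (all names are mine; this is an illustration rather than the authors' code). Two images can then be compared by weighted histogram intersection of the resulting vectors.

```python
import numpy as np

def spatial_pyramid_histogram(xs, ys, words, vocab_size, img_w, img_h, num_levels=3):
    """Concatenate per-cell bag-of-words histograms; level l uses a 2**l x 2**l grid."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    words = np.asarray(words, dtype=int)              # visual-word index per feature
    parts = []
    for level in range(num_levels):
        grid = 2 ** level
        cell_w, cell_h = img_w / grid, img_h / grid
        col = np.clip((xs / cell_w).astype(int), 0, grid - 1)
        row = np.clip((ys / cell_h).astype(int), 0, grid - 1)
        for r in range(grid):
            for c in range(grid):
                in_cell = (row == r) & (col == c)
                parts.append(np.bincount(words[in_cell], minlength=vocab_size))
    return np.concatenate(parts)
```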


Spatial pyramid match
• Can capture scene categories well: texture-like patterns but with some variability in the positions of all the local pieces
• Sensitive to global shifts of the view

(Figure: confusion table)
Today
• (Deep) Neural networks
• Convolutional neural networks
Traditional Image Categorization: Training phase

(Pipeline: training images + training labels → image features → classifier training → trained classifier)

Slide credit: Jia-Bin Huang


Traditional Image Categorization: Testing phase

(Training: training images + training labels → image features → classifier training → trained classifier)
(Testing: test image → image features → trained classifier → prediction, e.g. “Outdoor”)

Slide credit: Jia-Bin Huang
Features have been key

SIFT [Lowe IJCV 04], HOG [Dalal and Triggs CVPR 05], SPM [Lazebnik et al. CVPR 06], Textons,
and many others: SURF, MSER, LBP, Color-SIFT, Color histogram, GLOH, …
Learning a Hierarchy of Feature Extractors

• Each layer of hierarchy extracts features from output of previous layer
• All the way from pixels → classifier
• Layers have the (nearly) same structure
• Train all layers jointly

(Diagram: image/video pixels → Layer 1 → Layer 2 → Layer 3 → simple classifier → image/video labels)

Slide: Rob Fergus
Learning Feature Hierarchy
Goal: Learn useful higher-level features from images

Feature representation [Lee et al., ICML 2009; CACM 2011]:
• Pixels (input data)
• 1st layer: “Edges”
• 2nd layer: “Object parts”
• 3rd layer: “Objects”

Slide: Rob Fergus


Learning Feature Hierarchy
• Better performance

• Other domains (unclear how to hand engineer):


– Kinect
– Video
– Multi-spectral

• Feature computation time


– Dozens of features regularly used [e.g., MKL]
– Getting prohibitive for large datasets (tens of seconds per image)

Slide: R. Fergus
Biological neuron and Perceptrons

A biological neuron | An artificial neuron (Perceptron): a linear classifier

Slide credit: Jia-Bin Huang


Simple, Complex and Hypercomplex cells

David H. Hubel and Torsten Wiesel

Suggested a hierarchy of feature detectors


in the visual cortex, with higher level features
responding to patterns of activation in lower
level cells, and propagating activation
upwards to still higher level cells.

Source: David Hubel's Eye, Brain, and Vision
Slide credit: Jia-Bin Huang
Hubel/Wiesel Architecture and Multi-layer Neural Network

Hubel and Wiesel’s architecture | Multi-layer Neural Network: a non-linear classifier

Slide credit: Jia-Bin Huang


Neuron: Linear Perceptron
• Inputs are feature values
• Each feature has a weight
• Sum is the activation
• If the activation is:
  – Positive, output +1
  – Negative, output -1

Slide credit: Pieter Abbeel and Dan Klein
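In code, this rule is just a weighted sum followed by a sign, plus the classic update toward misclassified examples. A minimal NumPy sketch (names mine):

```python
import numpy as np

def perceptron_predict(w, f):
    """Output +1 if the activation w . f(x) is positive, otherwise -1."""
    activation = np.dot(w, f)       # sum of weight * feature value
    return 1 if activation > 0 else -1

def perceptron_update(w, f, y, lr=1.0):
    """If the example (features f, label y in {-1, +1}) is misclassified, nudge w toward it."""
    if perceptron_predict(w, f) != y:
        w = w + lr * y * f
    return w
```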


Two-layer perceptron network

Slide credit: Pieter Abbeel and Dan Klein




Learning w
• Training examples
• Objective: a misclassification loss
• Procedure:
  – Gradient descent / hill climbing

Slide credit: Pieter Abbeel and Dan Klein


Hill climbing
• Simple, general idea:
  – Start wherever
  – Repeat: move to the best neighboring state
  – If no neighbors better than current, quit
  – Neighbors = small perturbations of w
• What’s bad?
  – Complete?
  – Optimal?

Slide credit: Pieter Abbeel and Dan Klein
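A sketch of this procedure applied to the weight vector w, with small random perturbations as neighbors and the number of training misclassifications as the objective. All names and constants below are my own choices.

```python
import numpy as np

def misclassifications(w, X, y):
    # X: (n, d) feature matrix, y: (n,) labels in {-1, +1}
    preds = np.where(X @ w > 0, 1, -1)
    return int(np.sum(preds != y))

def hill_climb(X, y, num_iters=1000, step=0.1, num_neighbors=20, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])                     # start wherever
    best = misclassifications(w, X, y)
    for _ in range(num_iters):
        # Neighbors = small perturbations of w
        neighbors = w + step * rng.normal(size=(num_neighbors, X.shape[1]))
        losses = [misclassifications(n, X, y) for n in neighbors]
        if min(losses) >= best:                         # no neighbor is better: quit
            break
        best = min(losses)
        w = neighbors[int(np.argmin(losses))]           # move to the best neighboring state
    return w
```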


Two-layer perceptron network

Slide credit: Pieter Abbeel and Dan Klein




Two-layer neural network

Slide credit: Pieter Abbeel and Dan Klein


Neural network properties
• Theorem (Universal function approximators): A two-layer network with a sufficient number of neurons can approximate any continuous function to any desired accuracy

• Practical considerations:
  – Can be seen as learning the features
  – Large number of neurons
  – Danger for overfitting
  – Hill-climbing procedure can get stuck in bad local optima

Approximation by Superpositions of a Sigmoidal Function, 1989. Slide credit: Pieter Abbeel and Dan Klein
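For reference, the kind of two-layer network the theorem refers to: one layer of sigmoid units feeding a linear output. A sketch in my own notation; with enough hidden units, sums of shifted and scaled sigmoids like this can approximate any continuous function on a bounded domain.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def two_layer_net(x, W1, b1, w2, b2):
    """y = w2 . sigmoid(W1 x + b1) + b2 : one hidden layer, linear output."""
    h = sigmoid(W1 @ x + b1)    # hidden-layer activations
    return w2 @ h + b2          # scalar output
```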
Today
• (Deep) Neural networks
• Convolutional neural networks
Significant recent impact on the field

Big labeled datasets + deep learning + GPU technology
(Chart: ImageNet top-5 error (%) over successive challenge years)

Slide credit: Dinesh Jayaraman
Convolutional Neural Networks
(CNN, ConvNet, DCN)

• CNN = a multi-layer neural network with


– Local connectivity:
• Neurons in a layer are only connected to a small region
of the layer before it
– Share weight parameters across spatial positions:
• Learning shift-invariant filter kernels

Image credit: A. Karpathy


Jia-Bin Huang and Derek Hoiem, UIUC
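A quick back-of-the-envelope count makes the benefit of local connectivity and weight sharing concrete; the layer sizes below are just an example I picked, not from the slides.

```python
# Example: a 32x32x3 input feeding 64 output maps / units.
in_h, in_w, in_c = 32, 32, 3
out_maps, k = 64, 5                                   # 64 filters of size 5x5

fully_connected = (in_h * in_w * in_c) * out_maps     # every unit connected to every pixel
conv_shared = out_maps * (k * k * in_c + 1)           # one small shared kernel per map (+ bias)

print(fully_connected)   # -> 196608
print(conv_shared)       # -> 4864
```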
LeNet [LeCun et al. 1998]

Gradient-based learning applied to document recognition [LeCun, Bottou, Bengio, Haffner 1998]
(LeNet-1 from 1993)
Jia-Bin Huang and Derek Hoiem, UIUC
What is a Convolution?
• Weighted moving sum

(Figure: an input feature map convolved with a filter to produce an activation map)

slide credit: S. Lazebnik
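A direct (slow but clear) NumPy sketch of this weighted moving sum. As in most CNN libraries, it is really cross-correlation, i.e. the kernel is not flipped.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D convolution: slide the kernel and take a weighted sum at each position."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)   # weighted moving sum
    return out
```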
Convolutional Neural Networks

Layer stack (bottom to top): input image → convolution (learned) → non-linearity → spatial pooling → normalization → feature maps

slide credit: S. Lazebnik


Convolutional Neural Networks
Convolution (learned): filters applied to the input feature map

slide credit: S. Lazebnik
Convolutional Neural Networks
Non-linearity: Rectified Linear Unit (ReLU)

slide credit: S. Lazebnik


Convolutional Neural Networks
Spatial pooling: max pooling, a non-linear down-sampling that provides translation invariance

slide credit: S. Lazebnik
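A minimal NumPy sketch of two of these stages: the ReLU non-linearity from the previous slide and 2x2 max pooling (names mine).

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: elementwise max(0, x)
    return np.maximum(0, x)

def max_pool_2x2(feature_map):
    """Non-linear down-sampling: keep the max of each 2x2 block.
    Small shifts of the input tend not to change the output (translation invariance)."""
    H, W = feature_map.shape
    H2, W2 = H // 2, W // 2
    blocks = feature_map[:H2 * 2, :W2 * 2].reshape(H2, 2, W2, 2)
    return blocks.max(axis=(1, 3))
```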


Engineered vs. learned features

Convolutional filters are trained in a supervised manner by back-propagating classification error.

Engineered pipeline: Image → Feature extraction → Pooling → Classifier → Label
Learned pipeline: Image → Convolution/pool (×5) → Dense (×3) → Label

Jia-Bin Huang and Derek Hoiem, UIUC
SIFT Descriptor
Lowe [IJCV 2004]

Image pixels → apply oriented filters → spatial pool (sum) → normalize to unit length → feature vector

slide credit: R. Fergus
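A heavily simplified sketch in that spirit (this is not Lowe's actual descriptor): gradient-orientation responses stand in for the oriented filters, per-cell sums do the spatial pooling, and the result is normalized to unit length.

```python
import numpy as np

def sift_like_descriptor(patch, grid=4, num_orient=8):
    """Crude SIFT-flavored descriptor for a square grayscale patch."""
    gy, gx = np.gradient(patch.astype(float))            # oriented-filter-like responses
    mag = np.hypot(gx, gy)
    orient = np.arctan2(gy, gx) % (2 * np.pi)
    obin = np.minimum((orient / (2 * np.pi) * num_orient).astype(int), num_orient - 1)

    H, W = patch.shape
    desc = np.zeros((grid, grid, num_orient))
    cell_h, cell_w = H / grid, W / grid
    for i in range(H):                                    # spatial pool: sum per cell
        for j in range(W):
            r = min(int(i / cell_h), grid - 1)
            c = min(int(j / cell_w), grid - 1)
            desc[r, c, obin[i, j]] += mag[i, j]

    desc = desc.ravel()
    return desc / (np.linalg.norm(desc) + 1e-12)          # normalize to unit length
```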
Spatial Pyramid Matching
Lazebnik, Schmid, Ponce [CVPR 2006]

SIFT features → filter with visual words → max → multi-scale spatial pool (sum) → classifier

slide credit: R. Fergus
Visualizing what was learned
• What do the learned filters look like?

Typical first layer filters


https://www.wired.com/2012/06/google-x-neural-network/
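As one way to look at such filters today, assuming PyTorch and torchvision are installed: load an ImageNet-pretrained AlexNet and plot the weights of its first convolution layer (64 filters of size 11x11 over RGB). This is a hedged sketch, not part of the lecture.

```python
import matplotlib.pyplot as plt
import torchvision

# Load an ImageNet-pretrained AlexNet and grab the first conv layer's weights.
model = torchvision.models.alexnet(pretrained=True)   # newer torchvision: weights=...
filters = model.features[0].weight.detach()           # shape: (64, 3, 11, 11)

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, f in zip(axes.ravel(), filters):
    f = (f - f.min()) / (f.max() - f.min())            # rescale to [0, 1] for display
    ax.imshow(f.permute(1, 2, 0))                      # CHW -> HWC
    ax.axis("off")
plt.show()
```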
Application: ImageNet

• ~14 million labeled images, 20k classes

• Images gathered from Internet

• Human labels via Amazon Mechanical Turk

[Deng et al. CVPR 2009]

https://sites.google.com/site/deeplearningcvpr2014 Slide: R. Fergus


AlexNet
• Similar framework to LeCun’98 but:
• Bigger model (7 hidden layers, 650,000 units, 60,000,000 params)
• More data (10⁶ vs. 10³ images)
• GPU implementation (50x speedup over CPU)
• Trained on two GPUs for a week

A. Krizhevsky, I. Sutskever, and G. Hinton,


ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012
Jia-Bin Huang and Derek Hoiem, UIUC
ImageNet Classification Challenge

AlexNet

http://image-net.org/challenges/talks/2016/ILSVRC2016_10_09_clsloc.pdf
Industry Deployment
• Used in Facebook, Google, Microsoft
• Image Recognition, Speech Recognition, ….
• Fast at test time

Taigman et al. DeepFace: Closing the Gap to Human-Level Performance in Face


Verification, CVPR’14
Slide: R. Fergus
Recap
• Neural networks / multi-layer perceptrons
– View of neural networks as learning hierarchy of
features
• Convolutional neural networks
– Architecture of network accounts for image
structure
– “End-to-end” recognition from pixels
– Together with big (labeled) data and lots of computation → major success on benchmarks, image classification and beyond
