0% found this document useful (0 votes)

91 views

L13 Intro-Cnn Slides

The document discusses an introduction to convolutional neural networks (CNNs) lecture. It provides an overview of topics to be covered, including CNN applications in image classification, object detection, and object segmentation. Common CNN architectures like VGG16, ResNet, and Inception networks are also mentioned. The lecture aims to cover CNN basics, convolutional filters, cross-correlation vs convolution, CNNs in PyTorch and more.

Uploaded by

Mrs. Margaret Savitha P

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

91 views

L13 Intro-Cnn Slides

Uploaded by

Mrs. Margaret Savitha P

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 65

STAT 453: Introduction to Deep Learning and Generative Models

Sebastian Raschka
http://stat.wisc.edu/~sraschka/teaching

Lecture 13
Introduction to
with Applications in Python

Convolutional Neural Networks

Sebastian Raschka STAT 453: Intro to Deep Learning 1
Lecture Overview

1. What CNNs Can Do

2. Image Classi cation
3. Convolutional Neural Network Basics
4. Convolutional Filters and Weight-Sharing
5. Cross-Correlation vs Convolution
6. CNNs & Backpropagation
7. CNN Architectures
8. What a CNN Can See
9. CNNs in PyTorch

Sebastian Raschka STAT 453: Intro to Deep Learning 2

fi
Next Lecture

1. Padding (control output size in addition to stride)

2. Spatial Dropout and BatchNorm
3. Considerations for CNNs on GPUs
4. Common Architectures
A. VGG16 (simple, deep CNN)
B. ResNet and skip connections
C. Fully convolutional networks (no fully connected layers)
D. Inception (parallel convolutions and auxiliary losses)
5. Transfer learning

Sebastian Raschka STAT 453: Intro to Deep Learning 3

Common Applications of CNNs

1. What CNNs Can Do

Sebastian Raschka STAT 453: Intro to Deep Learning 4

fi
CNNs for Image Classi cation

output

Image Source:
twitter.com%2Fcats&psig=AOvVaw30_o-PCM-
K21DiMAJQimQ4&ust=1553887775741551
p(y=cat)

Image Source: https://www.pinterest.com/pin/

244742560974520446

Sebastian Raschka STAT 453: Intro to Deep Learning 5

fi
Object Detection

Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Uni ed, real-time object detection. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition (pp. 779-788).

Sebastian Raschka STAT 453: Intro to Deep Learning 6

fi
Object Segmentation

umbrella.98 bus.99

umbrella.98
person1.00

person1.00
person1.00
backpack1.00
person1.00 person.99
handbag.96 person.99
person1.00 person1.00 person1.00
person1.00 person1.00
person.95 person.98
person1.00
person1.00 person1.00 person.94 person1.00 person1.00 person.89

person1.00 sheep.99
backpack.99
sheep.99 sheep.86
backpack.93 sheep.82 sheep.96
sheep.96 sheep.93 sheep.91 sheep.95 sheep.96 sheep1.00
sheep1.00
sheep.99
sheep1.00
sheep.99
sheep.96

sheep.99

person.99
bottle.99
dining table.96

bottle.99
bottle.99

person.99person1.00
person1.00
traffic light.96 tv.99

chair.98 chair.99
chair.90
dining table.99 chair.96 wine glass.97
chair.86
bottle.99wine glass.93 chair.99
bowl.85 wine glass1.00

elephant1.00
wine glass.99
wine glass1.00
person1.00 chair.96 chair.99 fork.95

person1.00 traffic light.95 bowl.81

person1.00
traffic light.92 traffic light.84
person1.00 person.85
person.96 truck1.00 person.99
motorcycle1.00 person.96person1.00
person.83 person1.00
motorcycle1.00 person.98
person.99 person.91
person.90 person.87 car.99 car.92
person.99
person.92 car.99 car.93
car1.00
motorcycle.95
knife.83

person.96

Figure 2. Mask R-CNN results on the COCO test set. These results are based on ResNet-101 [15], achieving a mask AP of 35.7 and
running at 5 fps. Masks are shown in color, and bounding box, category, and confidences are also shown.

ingly minor change, RoIAlign has a large impact: it im- 2. Related Work
He, Kaiming, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. "Mask R-CNN." In Proceedings of the IEEE International Conference on
proves mask accuracy by relative 10% to 50%, showing
Computer Vision, pp. 2961-2969. 2017. R-CNN: The Region-based CNN (R-CNN) approach [10]
bigger gains under stricter localization metrics. Second, we
found it essential to decouple mask and class prediction: we to bounding-box object detection is to attend to a manage-
predict a binary mask for each class independently, without able number of candidate object regions [33, 16] and evalu-
competition among classes, and rely on the network’s RoI ate convolutional networks [20, 19] independently on each
classification branch to predict the Sebastian
category.Raschka RoI.
In contrast,STAT 453: R-CNN
Intro was
to Deep extended [14, 9] to allow attending to RoIs 7
Learning
Face Recognition

[1]
x
<latexit sha1_base64="p8Wx+cqqkWj+1zNtDaf7R0Gpalg=">AAAB7nicbVBNS8NAEJ34WetX1aOXxSJ4KkkV9Fj04rGC/YA0ls122y7dbMLuRCyhP8KLB0W8+nu8+W/ctjlo64OBx3szzMwLEykMuu63s7K6tr6xWdgqbu/s7u2XDg6bJk414w0Wy1i3Q2q4FIo3UKDk7URzGoWSt8LRzdRvPXJtRKzucZzwIKIDJfqCUbRS6+kh871g0i2V3Yo7A1kmXk7KkKPeLX11ejFLI66QSWqM77kJBhnVKJjkk2InNTyhbEQH3LdU0YibIJudOyGnVumRfqxtKSQz9fdERiNjxlFoOyOKQ7PoTcX/PD/F/lWQCZWkyBWbL+qnkmBMpr+TntCcoRxbQpkW9lbChlRThjahog3BW3x5mTSrFe+8Ur27KNeu8zgKcAwncAYeXEINbqEODWAwgmd4hTcncV6cd+dj3rri5DNH8AfO5w81Jo97</latexit>

Similarity/
Distance
Score

[2]
x
<latexit sha1_base64="vzgd/QPklE2GpKgvXahAxpOTUdw=">AAAB7nicbVBNS8NAEJ34WetX1aOXxSJ4KkkV9Fj04rGC/YA0ls122i7dbMLuRiyhP8KLB0W8+nu8+W/ctjlo64OBx3szzMwLE8G1cd1vZ2V1bX1js7BV3N7Z3dsvHRw2dZwqhg0Wi1i1Q6pRcIkNw43AdqKQRqHAVji6mfqtR1Sax/LejBMMIjqQvM8ZNVZqPT1kfjWYdEtlt+LOQJaJl5My5Kh3S1+dXszSCKVhgmrte25igowqw5nASbGTakwoG9EB+pZKGqEOstm5E3JqlR7px8qWNGSm/p7IaKT1OAptZ0TNUC96U/E/z09N/yrIuExSg5LNF/VTQUxMpr+THlfIjBhbQpni9lbChlRRZmxCRRuCt/jyMmlWK955pXp3Ua5d53EU4BhO4Aw8uIQa3EIdGsBgBM/wCm9O4rw4787HvHXFyWeO4A+czx82rI98</latexit>

Sebastian Raschka STAT 453: Intro to Deep Learning 8

Image Synthesis

Training set
Discriminator

Real /
Generated

Noise

Generated image

Generator

Sebastian Raschka STAT 453: Intro to Deep Learning 9

Why Computer Vision is (was) Hard:
Challenges with Image Classi cation
1. What CNNs Can Do
2. Image Classi cation
3. Convolutional Neural Network Basics
4. Convolutional Filters and Weight-Sharing
5. Cross-Correlation vs Convolution
6. CNNs & Backpropagation
7. CNN Architectures
8. What a CNN Can See
9. CNNs in PyTorch

Sebastian Raschka STAT 453: Intro to Deep Learning 10

fi
fi
Why Image Classi cation is Hard

Di erent lighting, contrast, viewpoints, etc.

Image Source: Image Source: https://www.123rf.com/

twitter.com%2Fcats&psig=AOvVaw30_o-PCM- photo_76714328_side-view-of-tabby-cat-face-over-
K21DiMAJQimQ4&ust=1553887775741551 white.html

Or even simple translation This is hard for traditional

methods like multi-layer
perceptrons, because
the prediction is
basically based on a sum
of pixel intensities

Sebastian Raschka STAT 453: Intro to Deep Learning 11

ff
fi
Traditional Approaches

a) Use hand-engineered features

Sebastian Raschka STAT 453: Intro to Deep Learning 12

Traditional Approaches

a) Use hand-engineered features

Sasaki, K., Hashimoto, M., & Nagata, N. (2016). Person Invariant Classification of Subtle Facial Expressions Using Coded Movement Direction of
Keypoints. In Video Analytics. Face and Facial Expression Recognition and Audience Measurement (pp. 61-72). Springer, Cham.

Sebastian Raschka STAT 453: Intro to Deep Learning 13

Traditional Approaches

b) Preprocess images (centering, cropping, etc.)

Image Source: https://www.tokkoro.com/2827328-cat-animals-nature-feline-park-green-trees-grass.html

Sebastian Raschka STAT 453: Intro to Deep Learning 14

The Main Concepts Behind
Convolutional Neural Networks
1. What CNNs Can Do
2. Image Classi cation
3. Convolutional Neural Network Basics
4. Convolutional Filters and Weight-Sharing
5. Cross-Correlation vs Convolution
6. CNNs & Backpropagation
7. CNN Architectures
8. What a CNN Can See
9. CNNs in PyTorch

Sebastian Raschka STAT 453: Intro to Deep Learning 15

fi
https://sg n.github.io/2020/06/22/Induction-Intro/

Sebastian Raschka STAT 453: Intro to Deep Learning 16

fi
Convolutional Neural Networks

Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard and L. D. Jackel: Backpropagation Applied to
Handwritten Zip Code Recognition, Neural Computation, 1(4):541-551, Winter 1989.

Sebastian Raschka STAT 453: Intro to Deep Learning 17

Convolutional Neural Networks

C3: f. maps 16@10x10

C1: feature maps S4: f. maps 16@5x5
INPUT 6@28x28
32x32 S2: f. maps C5: layer F6: layer OUTPUT
6@14x14 120 10
84

Full connection Gaussian connections

Convolutions Subsampling Convolutions Subsampling Full connection

Yann LeCun, Léon Bottou, Yoshua Bengio and Patrick Ha ner: Gradient Based Learning Applied to Document Recognition,
Proceedings of IEEE, 86(11):2278–2324, 1998.

Sebastian Raschka STAT 453: Intro to Deep Learning 18

ff
Hidden Layers

C3: f. maps 16@10x10

C1: feature maps S4: f. maps 16@5x5
INPUT 6@28x28
32x32 S2: f. maps C5: layer F6: layer OUTPUT
6@14x14 120 10
84

Full connection Gaussian connections

Convolutions Subsampling Convolutions Subsampling Full connection

"Automatic feature extractor" "Regular classi er"

Sebastian Raschka STAT 453: Intro to Deep Learning 19

fi
Hidden Layers

C3: f. maps 16@10x10

C1: feature maps S4: f. maps 16@5x5
INPUT 6@28x28
32x32 S2: f. maps C5: layer F6: layer OUTPUT
6@14x14 120 10
84

Full connection Gaussian connections

Convolutions Subsampling Convolutions Subsampling Full connection

Each "bunch" of feature maps represents one

hidden layer in the neural network.

Counting the FC layers, this network has 5 layers

Sebastian Raschka STAT 453: Intro to Deep Learning 20

Convolutional Neural Networks
Size of the resulting layers
Number of feature detectors
C3: f. maps 16@10x10
INPUT
C1: feature maps
6@28x28
S4: f. maps 16@5x5 Multi-layer perceptron
32x32 S2: f. maps C5: layer F6: layer OUTPUT
6@14x14 120 10
84

Full connection Gaussian connections

Convolutions Subsampling Convolutions Subsampling Full connection

nowadays called "pooling" basically a fully-connected

layer + MSE loss
"Feature detectors" (weight matrices) (nowadays common to use
that are being reused ("weight sharing") fc-layer + softmax
=> also called "kernel" or " lter" + cross entropy)

Yann LeCun, Léon Bottou, Yoshua Bengio and Patrick Ha ner: Gradient Based Learning Applied to Document Recognition,
Proceedings of IEEE, 86(11):2278–2324, 1998.

Sebastian Raschka STAT 453: Intro to Deep Learning 21

fi
ff
Main Concepts Behind
Convolutional Neural Networks

• Sparse-connectivity: A single element in the feature map is

connected to only a small patch of pixels. (This is very di erent
from connecting to the whole input image, in the case of multi-
layer perceptrons.)

• Parameter-sharing: The same weights are used for di erent

patches of the input image.

• Many layers: Combining extracted local patterns to global

patterns

Sebastian Raschka STAT 453: Intro to Deep Learning 22

ff
ff
A Closer Look at the Convolutional Layer

1. What CNNs Can Do

Sebastian Raschka STAT 453: Intro to Deep Learning 23

fi
Weight Sharing
A "feature detector" ( lter, kernel) slides over the inputs to generate
a feature map

9
X
w j xj
The pixels are j=1
referred to
<latexit sha1_base64="A0KexUBWYzFCrOQ6nv7KbgccmW0=">AAAB/3icbVDLSgMxFM34rPU1KrhxEyyCqzJTBXUhFN24rGAf0I5DJk3btElmSDJqGWfhr7hxoYhbf8Odf2PazkJbD1w4nHMv994TRIwq7Tjf1tz8wuLScm4lv7q2vrFpb23XVBhLTKo4ZKFsBEgRRgWpaqoZaUSSIB4wUg8GlyO/fkekoqG40cOIeBx1Be1QjLSRfHu3pWLuJ/1zN71NzlJ47/fhg9/37YJTdMaAs8TNSAFkqPj2V6sd4pgToTFDSjVdJ9JegqSmmJE034oViRAeoC5pGioQJ8pLxven8MAobdgJpSmh4Vj9PZEgrtSQB6aTI91T095I/M9rxrpz6iVURLEmAk8WdWIGdQhHYcA2lQRrNjQEYUnNrRD3kERYm8jyJgR3+uVZUisV3aNi6fq4UL7I4siBPbAPDoELTkAZXIEKqAIMHsEzeAVv1pP1Yr1bH5PWOSub2QF/YH3+AHSflbs=</latexit>

as "receptive eld"

"feature map"

Rationale: A feature detector that works well in one region

may also work well in another region

Plus, it is a nice reduction in parameters to t

Sebastian Raschka STAT 453: Intro to Deep Learning 24

fi
fi
fi
Weight Sharing
A "feature detector" (kernel) slides over the inputs to generate
a feature map

9
X
w j xj
<latexit sha1_base64="A0KexUBWYzFCrOQ6nv7KbgccmW0=">AAAB/3icbVDLSgMxFM34rPU1KrhxEyyCqzJTBXUhFN24rGAf0I5DJk3btElmSDJqGWfhr7hxoYhbf8Odf2PazkJbD1w4nHMv994TRIwq7Tjf1tz8wuLScm4lv7q2vrFpb23XVBhLTKo4ZKFsBEgRRgWpaqoZaUSSIB4wUg8GlyO/fkekoqG40cOIeBx1Be1QjLSRfHu3pWLuJ/1zN71NzlJ47/fhg9/37YJTdMaAs8TNSAFkqPj2V6sd4pgToTFDSjVdJ9JegqSmmJE034oViRAeoC5pGioQJ8pLxven8MAobdgJpSmh4Vj9PZEgrtSQB6aTI91T095I/M9rxrpz6iVURLEmAk8WdWIGdQhHYcA2lQRrNjQEYUnNrRD3kERYm8jyJgR3+uVZUisV3aNi6fq4UL7I4siBPbAPDoELTkAZXIEKqAIMHsEzeAVv1pP1Yr1bH5PWOSub2QF/YH3+AHSflbs=</latexit>
j=1

Sebastian Raschka STAT 453: Intro to Deep Learning 25

Weight Sharing
A "feature detector" (kernel) slides over the inputs to generate
a feature map

Sebastian Raschka STAT 453: Intro to Deep Learning 26

Weight Sharing
A "feature detector" (kernel) slides over the inputs to generate
a feature map

Sebastian Raschka STAT 453: Intro to Deep Learning 27

Weight Sharing
A "feature detector" (kernel) slides over the inputs to generate
a feature map

Sebastian Raschka STAT 453: Intro to Deep Learning 28

9
X (@1)
wj xj
j=1
Multiple "feature
<latexit sha1_base64="f26ph3SsblXR0kXxlacC2FehXAE=">AAACBnicbVDLSsNAFJ3UV62vqEsRBotQNyWpgroQim5cVrAPaNMwmU7aaWeSMDNRS8jKjb/ixoUibv0Gd/6N08dCWw9cOJxzL/fe40WMSmVZ30ZmYXFpeSW7mltb39jcMrd3ajKMBSZVHLJQNDwkCaMBqSqqGGlEgiDuMVL3Blcjv35HhKRhcKuGEXE46gbUpxgpLbnmfkvG3E36F3baTs5TeO/220mhbB+l8MHtu2beKlpjwHliT0keTFFxza9WJ8QxJ4HCDEnZtK1IOQkSimJG0lwrliRCeIC6pKlpgDiRTjJ+I4WHWulAPxS6AgXH6u+JBHEph9zTnRypnpz1RuJ/XjNW/pmT0CCKFQnwZJEfM6hCOMoEdqggWLGhJggLqm+FuIcEwkonl9Mh2LMvz5NaqWgfF0s3J/ny5TSOLNgDB6AAbHAKyuAaVEAVYPAInsEreDOejBfj3fiYtGaM6cwu+APj8wfkL5gZ</latexit>

detectors" (kernels) are used

to create multiple feature
maps

9
X (@2)
w j xj
<latexit sha1_base64="nCqyd07UuJkUGWlSGLMV2F7bQVM=">AAACBnicbVDLSsNAFJ3UV62vqEsRBotQNyWpgroQim5cVrAPaNMwmU7aaSeTMDNRS8jKjb/ixoUibv0Gd/6N08dCWw9cOJxzL/fe40WMSmVZ30ZmYXFpeSW7mltb39jcMrd3ajKMBSZVHLJQNDwkCaOcVBVVjDQiQVDgMVL3Blcjv35HhKQhv1XDiDgB6nLqU4yUllxzvyXjwE36F3baTs5TeO/220mhXDpK4YPbd828VbTGgPPEnpI8mKLiml+tTojjgHCFGZKyaVuRchIkFMWMpLlWLEmE8AB1SVNTjgIinWT8RgoPtdKBfih0cQXH6u+JBAVSDgNPdwZI9eSsNxL/85qx8s+chPIoVoTjySI/ZlCFcJQJ7FBBsGJDTRAWVN8KcQ8JhJVOLqdDsGdfnie1UtE+LpZuTvLly2kcWbAHDkAB2OAUlME1qIAqwOARPINX8GY8GS/Gu/Exac0Y05ld8AfG5w/luZga</latexit>
j=1

9
X (@3)
wj xj
<latexit sha1_base64="N3BOf0nmcHBzr6vnBzaSoMFhcQo=">AAACBnicbVDLSsNAFJ34rPUVdSnCYBHqpiStoC6EohuXFewD2jRMppN22pkkzEzUErJy46+4caGIW7/BnX/j9LHQ1gMXDufcy733eBGjUlnWt7GwuLS8sppZy65vbG5tmzu7NRnGApMqDlkoGh6ShNGAVBVVjDQiQRD3GKl7g6uRX78jQtIwuFXDiDgcdQPqU4yUllzzoCVj7ib9CzttJ+cpvHf77SRfLh2n8MHtu2bOKlhjwHliT0kOTFFxza9WJ8QxJ4HCDEnZtK1IOQkSimJG0mwrliRCeIC6pKlpgDiRTjJ+I4VHWulAPxS6AgXH6u+JBHEph9zTnRypnpz1RuJ/XjNW/pmT0CCKFQnwZJEfM6hCOMoEdqggWLGhJggLqm+FuIcEwkonl9Uh2LMvz5NasWCXCsWbk1z5chpHBuyDQ5AHNjgFZXANKqAKMHgEz+AVvBlPxovxbnxMWheM6cwe+APj8wfnQ5gb</latexit>
j=1

Sebastian Raschka STAT 453: Intro to Deep Learning 29

Size Before and After Convolutions

Feature map size: kernel width

input width

padding
W K + 2P
O= +1
<latexit sha1_base64="F3e+5qMk1hWaddof/b46u0hNgJ4=">AAACBXicbZC7SgNBFIbPxluMt1VLLQaDIIhhNwraCEEbwcKI5gLJEmYns8mQ2Qszs0JYtrHxVWwsFLH1Hex8GyfJFpr4w8DHf87hzPndiDOpLOvbyM3NLywu5ZcLK6tr6xvm5lZdhrEgtEZCHoqmiyXlLKA1xRSnzUhQ7LucNtzB5ajeeKBCsjC4V8OIOj7uBcxjBCttdczdG3SO2p7AJGmgI3SNDlG5miZ3qQa7YxatkjUWmgU7gyJkqnbMr3Y3JLFPA0U4lrJlW5FyEiwUI5ymhXYsaYTJAPdoS2OAfSqdZHxFiva100VeKPQLFBq7vycS7Es59F3d6WPVl9O1kflfrRUr78xJWBDFigZkssiLOVIhGkWCukxQovhQAyaC6b8i0sc6EqWDK+gQ7OmTZ6FeLtnHpfLtSbFykcWRhx3YgwOw4RQqcAVVqAGBR3iGV3gznowX4934mLTmjGxmG/7I+PwBia6VZg==</latexit>
S

output width stride

Sebastian Raschka STAT 453: Intro to Deep Learning 30

Kernel Dimensions and Trainable Parameters

For a grayscale image with a

5x5 feature detector (kernel),
we have the following dimensions
(number of parameters to learn)

What do you think is the output size

for this 28x28 image?

Sebastian Raschka STAT 453: Intro to Deep Learning 31

CNNs and Translation/Rotation/Scale Invariance

Note that CNNs are not really invariant to scale, rotation,

translation, etc.

The activations are still

dependent on the location,
etc.

Sebastian Raschka STAT 453: Intro to Deep Learning 32

Pooling Layers Can Help With Local Invariance

Sebastian Raschka, Vahid Mirjalili. Python Machine

Learning. 3rd Edition. Birmingham, UK: Packt
Publishing, 2019. ISBN: 978-1789955750

Downside: Information is lost.

May not matter for classi cation, but applications where relative position
is important (like face recognition)
In practice for CNNs: some image preprocessing still recommended

Sebastian Raschka STAT 453: Intro to Deep Learning 33

fi
Pooling Layers Can Help With Local Invariance

Note that typical pooling layers do not have any learnable parameters

Downside: Information is lost.

May not matter for classi cation, but applications where relative position
is important (like face recognition)
In practice for CNNs: some image preprocessing still recommended

Sebastian Raschka STAT 453: Intro to Deep Learning 34

fi
Why are Convolutional Nets
Using Cross-Correlation?
1. What CNNs Can Do
2. Image Classi cation
3. Convolutional Neural Network Basics
4. Convolutional Filters and Weight-Sharing
5. Cross-Correlation vs Convolution
6. CNNs & Backpropagation
7. CNN Architectures
8. What a CNN Can See
9. CNNs in PyTorch

Sebastian Raschka STAT 453: Intro to Deep Learning 35

fi
Cross-Correlation vs Convolution

Deep Learning Jargon: convolution in DL is actually cross-correlation

Cross-correlation is our sliding dot product over the image

Z[i, j]
<latexit sha1_base64="pgja+B12YuauQl9BN2y3pM0zL0U=">AAAB7nicbVBNSwMxEJ2tX7V+VT16CRbBg5TdKuix6MVjBfuB26Vk02wbm2SXJCuUpT/CiwdFvPp7vPlvTNs9aOuDgcd7M8zMCxPOtHHdb6ewsrq2vlHcLG1t7+zulfcPWjpOFaFNEvNYdUKsKWeSNg0znHYSRbEIOW2Ho5up336iSrNY3ptxQgOBB5JFjGBjpfaDz87QY9ArV9yqOwNaJl5OKpCj0St/dfsxSQWVhnCste+5iQkyrAwjnE5K3VTTBJMRHlDfUokF1UE2O3eCTqzSR1GsbEmDZurviQwLrccitJ0Cm6Fe9Kbif56fmugqyJhMUkMlmS+KUo5MjKa/oz5TlBg+tgQTxeytiAyxwsTYhEo2BG/x5WXSqlW982rt7qJSv87jKMIRHMMpeHAJdbiFBjSBwAie4RXenMR5cd6dj3lrwclnDuEPnM8faFKO9Q==</latexit>

"feature map"

Sebastian Raschka STAT 453: Intro to Deep Learning 36

Cross-Correlation vs Convolution

"feature map"

Cross-Correlation:
k
X k
X
Z[i, j] = K[u, v]A[i + u, j + v]
<latexit sha1_base64="yrGEywl1Y1LByNhPCkHvbOrPHc4=">AAACJHicbVDJSgNBFOyJW4xb1KOXxiAIiWEmCgoSiHoRvEQwC07G0NPpSTrpWeglEIZ8jBd/xYsHFzx48VvsLIhGCxrqVb3H61duxKiQpvlhJObmFxaXksupldW19Y305lZVhIpjUsEhC3ndRYIwGpCKpJKResQJ8l1Gam7vYuTX+oQLGgY3chARx0ftgHoUI6mlZvr01qY52HWKDaH8ZqyKB73hXdwbwknd/66vbJWDfQee2TSrWTfbd5rpjJk3x4B/iTUlGTBFuZl+bbRCrHwSSMyQELZlRtKJEZcUMzJMNZQgEcI91Ca2pgHyiXDi8ZFDuKeVFvRCrl8g4Vj9OREjX4iB7+pOH8mOmPVG4n+eraR34sQ0iJQkAZ4s8hSDMoSjxGCLcoIlG2iCMKf6rxB3EEdY6lxTOgRr9uS/pFrIW4f5wvVRpnQ+jSMJdsAu2AcWOAYlcAnKoAIwuAeP4Bm8GA/Gk/FmvE9aE8Z0Zhv8gvH5BZ6lo4U=</latexit>
u= k v= k

Z[i, j] = K ⌦ A
<latexit sha1_base64="i/izFKQB/27hVxepexqMFOTaYd8=">AAAB/nicbVBNS8NAEN34WetXVDx5WSyCBylJFfQiVL0IXirYD0xD2Wy37dpNNuxOhBIK/hUvHhTx6u/w5r9x2+agrQ8GHu/NMDMviAXX4Djf1tz8wuLScm4lv7q2vrFpb23XtEwUZVUqhVSNgGgmeMSqwEGwRqwYCQPB6kH/auTXH5nSXEZ3MIiZH5JuxDucEjBSy9699/gRfvDxOb7BTQk8ZBpftOyCU3TGwLPEzUgBZai07K9mW9IkZBFQQbT2XCcGPyUKOBVsmG8mmsWE9kmXeYZGxKzx0/H5Q3xglDbuSGUqAjxWf0+kJNR6EAamMyTQ09PeSPzP8xLonPkpj+IEWEQnizqJwCDxKAvc5opREANDCFXc3IppjyhCwSSWNyG40y/Pklqp6B4XS7cnhfJlFkcO7aF9dIhcdIrK6BpVUBVRlKJn9IrerCfrxXq3Piatc1Y2s4P+wPr8AYAlk+g=</latexit>

Sebastian Raschka STAT 453: Intro to Deep Learning 37

Cross-Correlation vs Convolution

7) 8) 9)
1,-1 1,0 1,1

Sebastian Raschka STAT 453: Intro to Deep Learning 38

Cross-Correlation vs Convolution
k
X k
X
Cross-Correlation: Z[i, j] = K[u, v]A[i + u, j + v] Z[i, j] = K ⌦ A
<latexit sha1_base64="i/izFKQB/27hVxepexqMFOTaYd8=">AAAB/nicbVBNS8NAEN34WetXVDx5WSyCBylJFfQiVL0IXirYD0xD2Wy37dpNNuxOhBIK/hUvHhTx6u/w5r9x2+agrQ8GHu/NMDMviAXX4Djf1tz8wuLScm4lv7q2vrFpb23XtEwUZVUqhVSNgGgmeMSqwEGwRqwYCQPB6kH/auTXH5nSXEZ3MIiZH5JuxDucEjBSy9699/gRfvDxOb7BTQk8ZBpftOyCU3TGwLPEzUgBZai07K9mW9IkZBFQQbT2XCcGPyUKOBVsmG8mmsWE9kmXeYZGxKzx0/H5Q3xglDbuSGUqAjxWf0+kJNR6EAamMyTQ09PeSPzP8xLonPkpj+IEWEQnizqJwCDxKAvc5opREANDCFXc3IppjyhCwSSWNyG40y/Pklqp6B4XS7cnhfJlFkcO7aF9dIhcdIrK6BpVUBVRlKJn9IrerCfrxXq3Piatc1Y2s4P+wPr8AYAlk+g=</latexit>

<latexit sha1_base64="yrGEywl1Y1LByNhPCkHvbOrPHc4=">AAACJHicbVDJSgNBFOyJW4xb1KOXxiAIiWEmCgoSiHoRvEQwC07G0NPpSTrpWeglEIZ8jBd/xYsHFzx48VvsLIhGCxrqVb3H61duxKiQpvlhJObmFxaXksupldW19Y305lZVhIpjUsEhC3ndRYIwGpCKpJKResQJ8l1Gam7vYuTX+oQLGgY3chARx0ftgHoUI6mlZvr01qY52HWKDaH8ZqyKB73hXdwbwknd/66vbJWDfQee2TSrWTfbd5rpjJk3x4B/iTUlGTBFuZl+bbRCrHwSSMyQELZlRtKJEZcUMzJMNZQgEcI91Ca2pgHyiXDi8ZFDuKeVFvRCrl8g4Vj9OREjX4iB7+pOH8mOmPVG4n+eraR34sQ0iJQkAZ4s8hSDMoSjxGCLcoIlG2iCMKf6rxB3EEdY6lxTOgRr9uS/pFrIW4f5wvVRpnQ+jSMJdsAu2AcWOAYlcAnKoAIwuAeP4Bm8GA/Gk/FmvE9aE8Z0Zhv8gvH5BZ6lo4U=</latexit>
u= k v= k

k
X k
X
Convolution: Z[i, j] = K[u, v]A[i u, j v]
<latexit sha1_base64="gciLvvrtiG4n9L4bASqngj19+7w=">AAACJHicbVDJSgNBFOyJW4xb1KOXxiB4SMJMFBQkEPUieIlgFpyMoafTSTrTs9BLIAz5GC/+ihcPLnjw4rfYWRBNLGioV/Uer1+5EaNCmuankVhYXFpeSa6m1tY3NrfS2ztVESqOSQWHLOR1FwnCaEAqkkpG6hEnyHcZqbne5civ9QkXNAxu5SAijo86AW1TjKSWmumzO5tmYc8pNoTym7Eq5rzhfewN4aTu/9TXtsrCvgPPbZrTrJfrO810xsybY8B5Yk1JBkxRbqbfGq0QK58EEjMkhG2ZkXRixCXFjAxTDSVIhLCHOsTWNEA+EU48PnIID7TSgu2Q6xdIOFZ/T8TIF2Lgu7rTR7IrZr2R+J9nK9k+dWIaREqSAE8WtRWDMoSjxGCLcoIlG2iCMKf6rxB3EUdY6lxTOgRr9uR5Ui3kraN84eY4U7qYxpEEe2AfHAILnIASuAJlUAEYPIAn8AJejUfj2Xg3PiatCWM6swv+wPj6BqTHo4k=</latexit>
u= k v= k
Z[i, j] = K ⇤ A
<latexit sha1_base64="vZGNFWQeTgSkyiycymTGu76B7qc=">AAAB+HicbVDLSgNBEOyNrxgfWfXoZTAIIhJ2o6AXIepF8BLBPHCzhNnJbDJm9sHMrBCXfIkXD4p49VO8+TdOkj1oYkFDUdVNd5cXcyaVZX0buYXFpeWV/GphbX1js2hubTdklAhC6yTikWh5WFLOQlpXTHHaigXFgcdp0xtcjf3mIxWSReGdGsbUDXAvZD4jWGmpYxbvHXaEHlx0jm7QIbromCWrbE2A5omdkRJkqHXMr3Y3IklAQ0U4ltKxrVi5KRaKEU5HhXYiaYzJAPeoo2mIAyrddHL4CO1rpYv8SOgKFZqovydSHEg5DDzdGWDVl7PeWPzPcxLln7kpC+NE0ZBMF/kJRypC4xRQlwlKFB9qgolg+lZE+lhgonRWBR2CPfvyPGlUyvZxuXJ7UqpeZnHkYRf24ABsOIUqXEMN6kAggWd4hTfjyXgx3o2PaWvOyGZ24A+Mzx9QvZDp</latexit>
9) 8) 7)
-1,-1 -1,0 -1,1
Basically, we are ipping the kernel (or the
6) 5) 4)
receptive eld) horizontally and vertically
0,-1 0,0 0,1

3) 2) 1)
1,-1 1,0 1,1

Sebastian Raschka STAT 453: Intro to Deep Learning 39

fi
fl
Sebastian Raschka STAT 453: Intro to Deep Learning 40
Sebastian Raschka STAT 453: Intro to Deep Learning 41
Cross-Correlation vs Convolution

Deep Learning Jargon: convolution in DL is actually cross-correlation

"Real" convolution has the nice associative property:

(A ⇤ B) ⇤ C = A ⇤ (B ⇤ C)
<latexit sha1_base64="NmDEdG9hUsHl7PiAA9yVIT7ok0s=">AAAB+nicbVDLSsNAFJ3UV62vVJduBovQZlGSKuhG6GPjsoJ9QBvKZDpph04mYWailNhPceNCEbd+iTv/xmmbhbYeuHA4517uvceLGJXKtr+NzMbm1vZOdje3t39weGTmj9syjAUmLRyyUHQ9JAmjnLQUVYx0I0FQ4DHS8SaNud95IELSkN+raUTcAI049SlGSksDM1+sWfWS1YA3sGYV61ajNDALdtleAK4TJyUFkKI5ML/6wxDHAeEKMyRlz7Ej5SZIKIoZmeX6sSQRwhM0Ij1NOQqIdJPF6TN4rpUh9EOhiyu4UH9PJCiQchp4ujNAaixXvbn4n9eLlX/tJpRHsSIcLxf5MYMqhPMc4JAKghWbaoKwoPpWiMdIIKx0WjkdgrP68jppV8rORblyd1mo1tM4suAUnIEicMAVqIJb0AQtgMEjeAav4M14Ml6Md+Nj2Zox0pkT8AfG5w8dmJCs</latexit>

In DL, we usually don't care about that (as opposed to many traditional
computer vision and signal processing applications).

Also, cross-correlation is easier to implement.

Maybe the term "convolution" for cross-correlation became popular,

because "Cross-Correlational Neural Network" sounds weird ;)

Sebastian Raschka STAT 453: Intro to Deep Learning 42

How does Backpropagation
Work in CNNs?
1. What CNNs Can Do
2. Image Classi cation
3. Convolutional Neural Network Basics
4. Convolutional Filters and Weight-Sharing
5. Cross-Correlation vs Convolution
6. CNNs & Backpropagation
7. CNN Architectures
8. What a CNN Can See
9. CNNs in PyTorch

Sebastian Raschka STAT 453: Intro to Deep Learning 43

fi
Backpropagation in CNNs

Same overall concept as before: Multivariable chain rule,

but now with an additional weight sharing constraint

Sebastian Raschka STAT 453: Intro to Deep Learning 44

Remember Lecture 6? Graph with Weight
Sharing
1 (z1 ) = a1
@a1 <latexit sha1_base64="Kk055JsIihKrUR2Qz/L8arieVek=">AAAB/HicbZDLSsNAFIZP6q3WW7RLN4NFqJuSqKAboejGZQV7gTaEyXTSDp1MwsxEiKW+ihsXirj1Qdz5Nk7bLLT1h4GP/5zDOfMHCWdKO863VVhZXVvfKG6WtrZ3dvfs/YOWilNJaJPEPJadACvKmaBNzTSnnURSHAWctoPRzbTefqBSsVjc6yyhXoQHgoWMYG0s3y73FBtE2Herj757gq6QQd+uODVnJrQMbg4VyNXw7a9ePyZpRIUmHCvVdZ1Ee2MsNSOcTkq9VNEEkxEe0K5BgSOqvPHs+Ak6Nk4fhbE0T2g0c39PjHGkVBYFpjPCeqgWa1Pzv1o31eGlN2YiSTUVZL4oTDnSMZomgfpMUqJ5ZgATycytiAyxxESbvEomBHfxy8vQOq25ZzXn7rxSv87jKMIhHEEVXLiAOtxCA5pAIINneIU368l6sd6tj3lrwcpnyvBH1ucPPZSTMQ==</latexit>

x1 w1 <latexit sha1_base64="4Z/KrxxA+GlxhVaTOyJaNHrf4VU=">AAACC3icbVC7TsMwFHV4lvIKMLJYrZCYqgSQYKxgYSwSfUhtFN24TmvVcSLbAVVRdxZ+hYUBhFj5ATb+BqeNBLQcydK559xr+54g4Uxpx/mylpZXVtfWSxvlza3tnV17b7+l4lQS2iQxj2UnAEU5E7Spmea0k0gKUcBpOxhd5X77jkrFYnGrxwn1IhgIFjIC2ki+XemFEkjWS0BqBhyD705+qntTYd+uOjVnCrxI3IJUUYGGb3/2+jFJIyo04aBU13US7WX5nYTTSbmXKpoAGcGAdg0VEFHlZdNdJvjIKH0cxtIcofFU/T2RQaTUOApMZwR6qOa9XPzP66Y6vPAyJpJUU0FmD4UpxzrGeTC4zyQlmo8NASKZ+SsmQzDhaBNf2YTgzq+8SFonNfe05tycVeuXRRwldIgq6Bi56BzV0TVqoCYi6AE9oRf0aj1az9ab9T5rXbKKmQP0B9bHN7v7mtM=</latexit>
@w1
a1 @o
y
<latexit sha1_base64="cs1Q9fet/6GNtc+Tzw/y6WCTX8Y=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lU0GPRi8cW7Ae0oWy2k3btZhN2N0Io/QVePCji1Z/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4bua3n1BpHssHkyXoR3QoecgZNVZqZP1yxa26c5BV4uWkAjnq/fJXbxCzNEJpmKBadz03Mf6EKsOZwGmpl2pMKBvTIXYtlTRC7U/mh07JmVUGJIyVLWnIXP09MaGR1lkU2M6ImpFe9mbif143NeGNP+EySQ1KtlgUpoKYmMy+JgOukBmRWUKZ4vZWwkZUUWZsNiUbgrf88ippXVS9y6rbuKrUbvM4inACp3AOHlxDDe6hDk1ggPAMr/DmPDovzrvzsWgtOPnMMfyB8/kD6GeM/w==</latexit>

<latexit sha1_base64="BOU8IhEf1nCpOTJ2JoJhJKmU0Z0=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0m0oMeiF48V7Qe0oUy2m3bpZhN2N0IJ/QlePCji1V/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqKGvSWMSqE6BmgkvWNNwI1kkUwygQrB2Mb2d++4kpzWP5aCYJ8yMcSh5yisZKD9j3+uWKW3XnIKvEy0kFcjT65a/eIKZpxKShArXuem5i/AyV4VSwaamXapYgHeOQdS2VGDHtZ/NTp+TMKgMSxsqWNGSu/p7IMNJ6EgW2M0Iz0sveTPzP66YmvPYzLpPUMEkXi8JUEBOT2d9kwBWjRkwsQaq4vZXQESqkxqZTsiF4yy+vktZF1busuve1Sv0mj6MIJ3AK5+DBFdThDhrQBApDeIZXeHOE8+K8Ox+L1oKTzxzDHzifP+lBjYs=</latexit>

<latexit sha1_base64="5HJHR/B9CHeIlPgqihTyAybn2c4=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lU0GPRi8eK9gPaUDbbSbt0swm7G7GE/gQvHhTx6i/y5r9x2+agrQ8GHu/NMDMvSATXxnW/ncLK6tr6RnGztLW9s7tX3j9o6jhVDBssFrFqB1Sj4BIbhhuB7UQhjQKBrWB0M/Vbj6g0j+WDGSfoR3QgecgZNVa6f+p5vXLFrbozkGXi5aQCOeq98le3H7M0QmmYoFp3PDcxfkaV4UzgpNRNNSaUjegAO5ZKGqH2s9mpE3JilT4JY2VLGjJTf09kNNJ6HAW2M6JmqBe9qfif10lNeOVnXCapQcnmi8JUEBOT6d+kzxUyI8aWUKa4vZWwIVWUGZtOyYbgLb68TJpnVe+86t5dVGrXeRxFOIJjOAUPLqEGt1CHBjAYwDO8wpsjnBfn3fmYtxacfOYQ/sD5/AEMWo2i</latexit>

@l
<latexit sha1_base64="ELWCbynYAUOpCjzaHAkeFZeonCw=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lU0GPRi8eK9gPaUDbbSbt0swm7G6WE/gQvHhTx6i/y5r9x2+agrQ8GHu/NMDMvSATXxnW/ncLK6tr6RnGztLW9s7tX3j9o6jhVDBssFrFqB1Sj4BIbhhuB7UQhjQKBrWB0M/Vbj6g0j+WDGSfoR3QgecgZNVa6f+p5vXLFrbozkGXi5aQCOeq98le3H7M0QmmYoFp3PDcxfkaV4UzgpNRNNSaUjegAO5ZKGqH2s9mpE3JilT4JY2VLGjJTf09kNNJ6HAW2M6JmqBe9qfif10lNeOVnXCapQcnmi8JUEBOT6d+kzxUyI8aWUKa4vZWwIVWUGZtOyYbgLb68TJpnVe+86t5dVGrXeRxFOIJjOAUPLqEGt1CHBjAYwDO8wpsjnBfn3fmYtxacfOYQ/sD5/AEK1I2h</latexit>

<latexit sha1_base64="A5BNLDamJxmqDahJkf2wo8PNgJk=">AAACCXicbZBPS8MwGMbT+W/Of1WPXoJD8DRaFfQ49OJxgtuErZS3WbqFpWlJUmGUXr34Vbx4UMSr38Cb38Z0K6ibDwR+PO/7JnmfIOFMacf5sipLyyura9X12sbm1vaOvbvXUXEqCW2TmMfyLgBFORO0rZnm9C6RFKKA024wvirq3XsqFYvFrZ4k1ItgKFjICGhj+TbuhxJI1k9AagYcx/kPg+/m2LfrTsOZCi+CW0IdlWr59md/EJM0okITDkr1XCfRXlbcSTjNa/1U0QTIGIa0Z1BARJWXTTfJ8ZFxBjiMpTlC46n7eyKDSKlJFJjOCPRIzdcK879aL9XhhZcxkaSaCjJ7KEw51jEuYsEDJinRfGIAiGTmr5iMwESjTXg1E4I7v/IidE4a7mnDuTmrNy/LOKroAB2iY+Sic9RE16iF2oigB/SEXtCr9Wg9W2/W+6y1YpUz++iPrI9vdbeaJw==</latexit>
@a1
<latexit sha1_base64="EBSdFgeDtyPz9IVeKoGbRsZIvwA=">AAACB3icbZDLSsNAFIZP6q3WW9SlIINFcFUSFXRZdOOygr1AG8pkOmmHTiZhZiKUkJ0bX8WNC0Xc+grufBsnbUBt/WHg4z/nzMz5/ZgzpR3nyyotLa+srpXXKxubW9s79u5eS0WJJLRJIh7Jjo8V5UzQpmaa004sKQ59Ttv++Dqvt++pVCwSd3oSUy/EQ8ECRrA2Vt8+7AUSk7QXY6kZ5ohnPxxlqG9XnZozFVoEt4AqFGr07c/eICJJSIUmHCvVdZ1Ye2l+I+E0q/QSRWNMxnhIuwYFDqny0ukeGTo2zgAFkTRHaDR1f0+kOFRqEvqmM8R6pOZruflfrZvo4NJLmYgTTQWZPRQkHOkI5aGgAZOUaD4xgIlk5q+IjLAJRpvoKiYEd37lRWid1tyzmnN7Xq1fFXGU4QCO4ARcuIA63EADmkDgAZ7gBV6tR+vZerPeZ60lq5jZhz+yPr4BSfGZjg==</latexit>
@o L(y, o) = l
o
<latexit sha1_base64="xkDVhV2R7yGjiI8Bkoa6EodHAlw=">AAAB/nicbVDLSsNAFL2pr1pfUXHlZrAIFaQkKuhGKLpx4aKCfUAbymQ6aYdOJmFmIpRQ8FfcuFDErd/hzr9x0mah1QMDh3Pu5Z45fsyZ0o7zZRUWFpeWV4qrpbX1jc0te3unqaJEEtogEY9k28eKciZoQzPNaTuWFIc+py1/dJ35rQcqFYvEvR7H1AvxQLCAEayN1LP3uiHWQ4J5ejupjI9RdIQuEe/ZZafqTIH+EjcnZchR79mf3X5EkpAKTThWquM6sfZSLDUjnE5K3UTRGJMRHtCOoQKHVHnpNP4EHRqlj4JImic0mqo/N1IcKjUOfTOZhVXzXib+53USHVx4KRNxoqkgs0NBwpGOUNYF6jNJieZjQzCRzGRFZIglJto0VjIluPNf/kuaJ1X3tOrcnZVrV3kdRdiHA6iAC+dQgxuoQwMIpPAEL/BqPVrP1pv1PhstWPnOLvyC9fENUuuUZw==</latexit>

<latexit sha1_base64="zmvhV5w6wvufBjgJnplzs3qmpp8=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69BIvgqSQq6LHoxWML9gPaUDbbSbt2sxt2N0IJ/QVePCji1Z/kzX/jts1BWx8MPN6bYWZemHCmjed9O4W19Y3NreJ2aWd3b/+gfHjU0jJVFJtUcqk6IdHImcCmYYZjJ1FI4pBjOxzfzfz2EyrNpHgwkwSDmAwFixglxkoN2S9XvKo3h7tK/JxUIEe9X/7qDSRNYxSGcqJ11/cSE2REGUY5Tku9VGNC6JgMsWupIDHqIJsfOnXPrDJwI6lsCePO1d8TGYm1nsSh7YyJGellbyb+53VTE90EGRNJalDQxaIo5a6R7uxrd8AUUsMnlhCqmL3VpSOiCDU2m5INwV9+eZW0Lqr+ZdVrXFVqt3kcRTiBUzgHH66hBvdQhyZQQHiGV3hzHp0X5935WLQWnHzmGP7A+fwB2T+M9Q==</latexit>

l <latexit sha1_base64="E5Kc1ZKr520j8ga7QDzfGA0mefk=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lU0GPRi8cW7Ae0oWy2k3btZhN2N0IJ/QVePCji1Z/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4bua3n1BpHssHM0nQj+hQ8pAzaqzUEP1yxa26c5BV4uWkAjnq/fJXbxCzNEJpmKBadz03MX5GleFM4LTUSzUmlI3pELuWShqh9rP5oVNyZpUBCWNlSxoyV39PZDTSehIFtjOiZqSXvZn4n9dNTXjjZ1wmqUHJFovCVBATk9nXZMAVMiMmllCmuL2VsBFVlBmbTcmG4C2/vEpaF1Xvsuo2riq12zyOIpzAKZyDB9dQg3uoQxMYIDzDK7w5j86L8+58LFoLTj5zDH/gfP4A1LOM8g==</latexit>

w 1 · x 1 = z1
3 (a1 , a2 ) =o
<latexit sha1_base64="+e1bOL2+yE8wQHw7R7Wi1lbuH7o=">AAAB/HicbZDLSsNAFIYn9VbrLdqlm8EiuCqJCroRim5cVrAXaEOYTCbt0MlMmJmoMdRXceNCEbc+iDvfxmmbhbb+MPDxn3M4Z/4gYVRpx/m2SkvLK6tr5fXKxubW9o69u9dWIpWYtLBgQnYDpAijnLQ01Yx0E0lQHDDSCUZXk3rnjkhFBb/VWUK8GA04jShG2li+Xb33XdjHodDwwdAFfPRd3645dWcquAhuATVQqOnbX/1Q4DQmXGOGlOq5TqK9HElNMSPjSj9VJEF4hAakZ5CjmCgvnx4/hofGCWEkpHlcw6n7eyJHsVJZHJjOGOmhmq9NzP9qvVRH515OeZJqwvFsUZQyqAWcJAFDKgnWLDOAsKTmVoiHSCKsTV4VE4I7/+VFaB/X3ZO6c3Naa1wWcZTBPjgAR8AFZ6ABrkETtAAGGXgGr+DNerJerHfrY9ZasoqZKvgj6/MHXCGTRw==</latexit>

x2 w1 a2
<latexit sha1_base64="MADeaGuZ5x0zDKtg8jYEX763ets=">AAAB/3icbVBNS8NAEN3Ur1q/ooIXL4tFqCAlaQW9CEUvHivYD2hDmGy37dJNNuxuhBJ78K948aCIV/+GN/+N2zYHrT4YeLw3w8y8IOZMacf5snJLyyura/n1wsbm1vaOvbvXVCKRhDaI4EK2A1CUs4g2NNOctmNJIQw4bQWj66nfuqdSMRHd6XFMvRAGEeszAtpIvn3QVWwQgl8tge+eYvArJ/gSC98uOmVnBvyXuBkpogx13/7s9gRJQhppwkGpjuvE2ktBakY4nRS6iaIxkBEMaMfQCEKqvHR2/wQfG6WH+0KaijSeqT8nUgiVGoeB6QxBD9WiNxX/8zqJ7l94KYviRNOIzBf1E461wNMwcI9JSjQfGwJEMnMrJkOQQLSJrGBCcBdf/kualbJbLTu3Z8XaVRZHHh2iI1RCLjpHNXSD6qiBCHpAT+gFvVqP1rP1Zr3PW3NWNrOPfsH6+Aa4qZP0</latexit>

<latexit sha1_base64="kbSwglAxLszB5oa4qHODldzc1d4=">AAAB6nicbVBNS8NAEJ34WetX1aOXxSJ4KkkR9Vj04rGi/YA2lM120y7dbMLuRCyhP8GLB0W8+ou8+W/ctjlo64OBx3szzMwLEikMuu63s7K6tr6xWdgqbu/s7u2XDg6bJk414w0Wy1i3A2q4FIo3UKDk7URzGgWSt4LRzdRvPXJtRKwecJxwP6IDJULBKFrp/qlX7ZXKbsWdgSwTLydlyFHvlb66/ZilEVfIJDWm47kJ+hnVKJjkk2I3NTyhbEQHvGOpohE3fjY7dUJOrdInYaxtKSQz9fdERiNjxlFgOyOKQ7PoTcX/vE6K4ZWfCZWkyBWbLwpTSTAm079JX2jOUI4toUwLeythQ6opQ5tO0YbgLb68TJrVindRce/Oy7XrPI4CHMMJnIEHl1CDW6hDAxgM4Ble4c2Rzovz7nzMW1ecfOYI/sD5/AEO2o2m</latexit>

<latexit sha1_base64="ELWCbynYAUOpCjzaHAkeFZeonCw=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lU0GPRi8eK9gPaUDbbSbt0swm7G6WE/gQvHhTx6i/y5r9x2+agrQ8GHu/NMDMvSATXxnW/ncLK6tr6RnGztLW9s7tX3j9o6jhVDBssFrFqB1Sj4BIbhhuB7UQhjQKBrWB0M/Vbj6g0j+WDGSfoR3QgecgZNVa6f+p5vXLFrbozkGXi5aQCOeq98le3H7M0QmmYoFp3PDcxfkaV4UzgpNRNNSaUjegAO5ZKGqH2s9mpE3JilT4JY2VLGjJTf09kNNJ6HAW2M6JmqBe9qfif10lNeOVnXCapQcnmi8JUEBOT6d+kzxUyI8aWUKa4vZWwIVWUGZtOyYbgLb68TJpnVe+86t5dVGrXeRxFOIJjOAUPLqEGt1CHBjAYwDO8wpsjnBfn3fmYtxacfOYQ/sD5/AEK1I2h</latexit>

@a2 <latexit sha1_base64="i3V+Hv5zU7q3mZ39kUdT/gvBnUk=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0mqoMeiF48V7Qe0oWy2k3bpZhN2N0IJ/QlePCji1V/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4dua3n1BpHstHM0nQj+hQ8pAzaqz0QPu1frniVt05yCrxclKBHI1++as3iFkaoTRMUK27npsYP6PKcCZwWuqlGhPKxnSIXUsljVD72fzUKTmzyoCEsbIlDZmrvycyGmk9iQLbGVEz0sveTPzP66YmvPYzLpPUoGSLRWEqiInJ7G8y4AqZERNLKFPc3krYiCrKjE2nZEPwll9eJa1a1buouveXlfpNHkcRTuAUzsGDK6jDHTSgCQyG8Ayv8OYI58V5dz4WrQUnnzmGP3A+fwDqxY2M</latexit>

<latexit sha1_base64="UfwdaiaQv7vCe0d17Hqvq1wxekM=">AAACC3icbVDLSgMxFM3UV62vUZduQovgqsxUQZdFNy4r2Ad0huFOmmlDMw+SjFKG7t34K25cKOLWH3Dn35hpB9TWA4GTc+69yT1+wplUlvVllFZW19Y3ypuVre2d3T1z/6Aj41QQ2iYxj0XPB0k5i2hbMcVpLxEUQp/Trj++yv3uHRWSxdGtmiTUDWEYsYARUFryzKoTCCCZk4BQDDgGrzH9ud179hR7Zs2qWzPgZWIXpIYKtDzz0xnEJA1ppAgHKfu2lSg3y2cSTqcVJ5U0ATKGIe1rGkFIpZvNdpniY60McBALfSKFZ+rvjgxCKSehrytDUCO56OXif14/VcGFm7EoSRWNyPyhIOVYxTgPBg+YoETxiSZABNN/xWQEOhyl46voEOzFlZdJp1G3T+vWzVmteVnEUUZHqIpOkI3OURNdoxZqI4Ie0BN6Qa/Go/FsvBnv89KSUfQcoj8wPr4BvY+a1A==</latexit>
@w1 <latexit sha1_base64="UHK8Rq4ihGcm4gMByPiIvg3wEO0=">AAAB/HicbZDLSsNAFIZPvNZ6i3bpZrAIdVOSKuhGKLpxWcFeoA1hMp20Q2eSMDMRYqmv4saFIm59EHe+jdM2C239YeDjP+dwzvxBwpnSjvNtrayurW9sFraK2zu7e/v2wWFLxakktEliHstOgBXlLKJNzTSnnURSLAJO28HoZlpvP1CpWBzd6yyhnsCDiIWMYG0s3y71FBsI7Ncqj757iq6QQd8uO1VnJrQMbg5lyNXw7a9ePyapoJEmHCvVdZ1Ee2MsNSOcToq9VNEEkxEe0K7BCAuqvPHs+Ak6MU4fhbE0L9Jo5v6eGGOhVCYC0ymwHqrF2tT8r9ZNdXjpjVmUpJpGZL4oTDnSMZomgfpMUqJ5ZgATycytiAyxxESbvIomBHfxy8vQqlXds6pzd16uX+dxFOAIjqECLlxAHW6hAU0gkMEzvMKb9WS9WO/Wx7x1xcpnSvBH1ucPQKeTMw==</latexit>
2 (z1 ) = a2

Upper path
@l @l @o @a1 @l @o @a2
<latexit sha1_base64="TYYPnCIxpTv7H9H6wB8RxqOKTTU=">AAAC9XicrVLLSsNAFJ3EV019VF26GSyCIJSkCroRim5cVrAPaEqZTCc6OMmEmYlSQv7DjQtF3Pov7vwbJ2nBmpauvDBwOPfce2buXC9iVCrb/jbMpeWV1bXSulXe2NzaruzstiWPBSYtzBkXXQ9JwmhIWooqRrqRICjwGOl4D1dZvvNIhKQ8vFWjiPQDdBdSn2KkNDXYMcquLxBO3AgJRRGDLP3FTwMnhdYFXCDhKXTxkCtoQVjU8SkdylrlyjnCLFlwPbb+xbS+yLReMLUGlapds/OAs8CZgCqYRHNQ+XKHHMcBCRVmSMqeY0eqn2Q9MSOp5caSRAg/oDvS0zBEAZH9JP+1FB5qZgh9LvQJFczZ6YoEBVKOAk8rA6TuZTGXkfNyvVj55/2EhlGsSIjHRn7MoOIwWwE4pIJgxUYaICyovivE90gPR+lFyYbgFJ88C9r1mnNSs29Oq43LyThKYB8cgCPggDPQANegCVoAG8J4Nl6NN/PJfDHfzY+x1DQmNXvgT5ifPxdI7YY=</latexit>
@w1
= · ·
@o @a1 @w1
+ · ·
@o @a2 @w1 (multivariable chain rule)
Lower path

Sebastian Raschka STAT 453: Intro to Deep Learning 45

Backpropagation in CNNs

Same overall concept as before: Multivariable chain rule,

but now with an additional weight sharing constraint

Due to weight sharing: w1 = w2 <latexit sha1_base64="FKGwU2qvUIEYUjcCdiIWnsvCgXU=">AAAB8HicbVBNSwMxEJ31s9avqkcvwSJ4KrtV0ItQ9OKxgv2QdlmyabYNTbJLkrWUpb/CiwdFvPpzvPlvTNs9aOuDgcd7M8zMCxPOtHHdb2dldW19Y7OwVdze2d3bLx0cNnWcKkIbJOaxaodYU84kbRhmOG0nimIRctoKh7dTv/VElWaxfDDjhPoC9yWLGMHGSo+jwEPXaBRUg1LZrbgzoGXi5aQMOepB6avbi0kqqDSEY607npsYP8PKMMLppNhNNU0wGeI+7VgqsaDaz2YHT9CpVXooipUtadBM/T2RYaH1WIS2U2Az0IveVPzP66QmuvIzJpPUUEnmi6KUIxOj6feoxxQlho8twUQxeysiA6wwMTajog3BW3x5mTSrFe+8Ur2/KNdu8jgKcAwncAYeXEIN7qAODSAg4Ble4c1Rzovz7nzMW1ecfOYI/sD5/AFAMo9k</latexit>

w1
<latexit sha1_base64="BGIvv1Que1aISVw+1pGEuT4uC1M=">AAAB6nicbVBNS8NAEJ34WetX1aOXxSJ4KkkV9Fj04rGi/YA2lM120y7dbMLuRCmhP8GLB0W8+ou8+W/ctjlo64OBx3szzMwLEikMuu63s7K6tr6xWdgqbu/s7u2XDg6bJk414w0Wy1i3A2q4FIo3UKDk7URzGgWSt4LRzdRvPXJtRKwecJxwP6IDJULBKFrp/qnn9Uplt+LOQJaJl5My5Kj3Sl/dfszSiCtkkhrT8dwE/YxqFEzySbGbGp5QNqID3rFU0YgbP5udOiGnVumTMNa2FJKZ+nsio5Ex4yiwnRHFoVn0puJ/XifF8MrPhEpS5IrNF4WpJBiT6d+kLzRnKMeWUKaFvZWwIdWUoU2naEPwFl9eJs1qxTuvVO8uyrXrPI4CHMMJnIEHl1CDW6hDAxgM4Ble4c2Rzovz7nzMW1ecfOYI/sD5/AELeI2j</latexit>

w2
<latexit sha1_base64="k9TH6JRVGzznxlg0BHK2AhK6Dh8=">AAAB6nicbVBNS8NAEJ34WetX1aOXxSJ4KkkV9Fj04rGi/YA2lM120y7dbMLuRCmhP8GLB0W8+ou8+W/ctjlo64OBx3szzMwLEikMuu63s7K6tr6xWdgqbu/s7u2XDg6bJk414w0Wy1i3A2q4FIo3UKDk7URzGgWSt4LRzdRvPXJtRKwecJxwP6IDJULBKFrp/qlX7ZXKbsWdgSwTLydlyFHvlb66/ZilEVfIJDWm47kJ+hnVKJjkk2I3NTyhbEQHvGOpohE3fjY7dUJOrdInYaxtKSQz9fdERiNjxlFgOyOKQ7PoTcX/vE6K4ZWfCZWkyBWbLwpTSTAm079JX2jOUI4toUwLeythQ6opQ5tO0YbgLb68TJrVindeqd5dlGvXeRwFOIYTOAMPLqEGt1CHBjAYwDO8wpsjnRfn3fmYt644+cwR/IHz+QMM/I2k</latexit>

Optional averaging
weight update: ✓ ◆
1 @L @L
w1 := w2 := w1 ⌘· +
<latexit sha1_base64="MYHrCBQOQYN/sQ1qQKoAWS1dkO0=">AAACenicjVFNa9tAEF2pSZs6Ses2x1IYYvJFqJHcQEugYNJLDz0kECcBy4jReuUsWX2wO6oxQj8ify23/pJeesjKVku+DhlY9vFm5u3smyhX0pDn/XbcF0vLL1+tvG6trq2/edt+9/7MZIXmYsAzlemLCI1QMhUDkqTERa4FJpES59HV9zp//ktoI7P0lGa5GCU4SWUsOZKlwvb1NPTh8BtMw97i8uETBIIQIODjjCCINfLSr8peBUEkJ5PdhoIgR00SFQQJ0iVHVf6sqvI/a6Uq2H928T/5vbDd8brePOAx8BvQYU0ch+2bYJzxIhEpcYXGDH0vp1FZK3MlqlZQGJEjv8KJGFqYYiLMqJxbV8GWZcYQZ9qelGDO3u0oMTFmlkS2sh7cPMzV5FO5YUHx11Ep07wgkfLFQ3GhgDKo9wBjqQUnNbMAuZZ2VuCXaL0iu62WNcF/+OXH4KzX9T93eycHnf5RY8cK+8A22S7z2RfWZz/YMRswzv44H51tZ8f56266e+7+otR1mp4Ndi/cg1vG1r5V</latexit>
2 @w1 @w2
Sebastian Raschka STAT 453: Intro to Deep Learning 46
What Are Some of the Common
Cnn Architectures?
1. What CNNs Can Do
2. Image Classi cation
3. Convolutional Neural Network Basics
4. Convolutional Filters and Weight-Sharing
5. Cross-Correlation vs Convolution
6. CNNs & Backpropagation
7. CNN Architectures
8. What a CNN Can See
9. CNNs in PyTorch

Sebastian Raschka STAT 453: Intro to Deep Learning 47

fi
Main Breakthrough for CNNs:
AlexNet & ImageNet

Figure 2: An illustration of the architecture of our CNN, explicitly showing the delineation of responsibilities
between the two GPUs. One GPU runs the layer-parts at the top of the figure while the other runs the layer-parts
at the bottom. The GPUs communicate only at certain layers. The network’s input is 150,528-dimensional, and
the number of neurons in the network’s remaining layers is given by 253,440–186,624–64,896–64,896–43,264–
4096–4096–1000.

neurons in a kernel map). The second convolutional layer takes as input the (response-normalized
AlexNet
and pooled) achieved 15.4%
output of the first error layer
convolutional on top-5
and filtersinit 2012
with 256 kernels of size 5 ⇥ 5 ⇥ 48.
The third, fourth, and fifth convolutional layers are connected to one another without any intervening
2nd best
pooling was not layers.
or normalization even close:
The third26.2%
convolutional layer has 384 kernels of size 3 ⇥ 3 ⇥
(nowadays
256 connected to~3% error onpooled)
the (normalized, ImageNet)
outputs of the second convolutional layer. The fourth
convolutional layer has 384 kernels of size 3 ⇥ 3 ⇥ 192 , and the fifth convolutional layer has 256
kernels of size 3 ⇥ 3 ⇥ 192. The fully-connected layers have 4096 neurons each.

Krizhevsky, A., Sutskever, I., & Hinton,

4 Reducing G. E. (2012). ImageNet classification with deep convolutional neural
Overfitting
networks. In AdvancesOur
in neural
neural information processing systems (pp. 1097-1105).
network architecture has 60 million parameters. Although the 1000 classes of ILSVRC
make each training example impose 10 bits of constraint on the mapping from image to label, this
turns out to be insufficient to learn so many parameters without considerable overfitting. Below, we
describe the two primary ways in which we combat overfitting.
Sebastian Raschka STAT 453: Intro to Deep Learning 48
Main Breakthrough for CNNs:
AlexNet & ImageNet

Stevens, Eli, Luca Antiga, and Thomas Viehmann. Deep learning with PyTorch. Manning Publications, 2020

Sebastian Raschka STAT 453: Intro to Deep Learning 49

rward pass and do not participate in back-
ral network samples a different architecture,
Main Breakthrough for CNNs:
reduces complex co-adaptations of neurons,
ar other neurons. It is, therefore, forced to
n with many different random subsets of the
AlexNet & ImageNet
ut multiply their outputs by 0.5, which is a
of the predictive distributions produced by

Figure 2. Without dropout, our network ex- Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009).
he number of iterations required to converge. Imagenet: A large-scale hierarchical image database. In 2009 IEEE
conference on computer vision and pattern recognition (pp. 248–255).

nt
nd
nt
In Figure 3: 96 convolutional kernels of size
er: 11⇥11⇥3 learned by the first convolutional
or layer on the 224⇥224⇥3 input images. The Figure 4: (Left) Eight ILSVRC-2010 test images and the five labels considered most probable by our model.
The correct label is written under each image, and the probability assigned to the correct label is also shown
top 48 kernels were learned on GPU 1 while with a red bar (if it happens to be in the top 5). (Right) Five ILSVRC-2010 test images in the first column. The
the bottom 48 kernels were learned on GPU remaining columns show the six training images that produce feature vectors in the last hidden layer with the
smallest Euclidean distance from the feature vector for the test image.
2. See Section 6.1 for details.
i
In the left panel of Figure 4 we qualitatively assess what the network has learned by computing its
top-5 predictions on eight test images. Notice that even off-center objects, such as the mite in the
D E top-left, can be recognized by the net. Most of the top-5 labels appear reasonable. For example,
ble, ✏ is the learning rate, and @w
@L
wi D
is only other types of cat are considered plausible labels for the leopard. In some cases (grille, cherry)
there is genuine ambiguity about the intended focus of the photograph.
he objective with respect to w, evaluated at G. E.
Krizhevsky, A., Sutskever, I., & Hinton, (2012). Imagenet
the network’s classification with thedeepfeatureconvolutional neural
i
Another way to probe visual knowledge is to consider activations induced
networks. In Advances in neural information processing
by an systems (pp.
image at the last, 4096-dimensional 1097-1105).
hidden layer. If two images produce feature activation
vectors with a small Euclidean separation, we can say that the higher levels of the neural network
consider them to be similar. Figure 4 shows five images from the test set and the six images from
ean Gaussian distribution with standard de- the training set that are most similar to each of them according to this measure. Notice that at the
pixel level, the retrieved training images are generally not close in L2 to the query images in the first
cond, fourth, and fifth convolutional layers, column. For example, the retrieved dogs and elephants appear in a variety of poses. We present the
e constant 1. This initialization accelerates
Sebastian Raschka
results for manySTAT 453:
more test Introinto
images theDeep Learning
supplementary material. 50
Main Breakthrough for CNNs:
AlexNet & ImageNet
The ImageNet set that was used
has ~1.2 million
images and 1000 classes

Accuracy is measured as top-5

performance:
Correct prediction if the true
label matches one of the top 5
predictions of the model

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural
FigureIn4:
networks. (Left) in
Advances Eight
neuralILSVRC-2010 test images
information processing and
systems the1097-1105).
(pp. five labels considered most prob
The correct label is written under each image, and the probability assigned to the correct
with a red bar (if it happens to be in the top 5). (Right) Five ILSVRC-2010 test images in th
remaining columns showSebastian
the sixRaschka
training STAT that produce feature vectors in the last 51
453: Intro to Deep Learning
images hid
Main Breakthrough for CNNs:
AlexNet & ImageNet
Note that the actual network
inputs were still 224x224
images
(random crops from
downsampled 256x256
images)

224x224 is still a good/

reasonable size today
(224*224*3 = 150,528 features)

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural
FigureIn4:
networks. (Left) in
Advances Eight
neuralILSVRC-2010 test images
information processing and
systems the1097-1105).
(pp. five labels considered most prob
The correct label is written under each image, and the probability assigned to the correct
with a red bar (if it happens to be in the top 5). (Right) Five ILSVRC-2010 test images in th
remaining columns showSebastian
the sixRaschka
training STAT that produce feature vectors in the last 52
453: Intro to Deep Learning
images hid
Common CNN Architectures

Figure 1: Top1 vs. network. Single-crop top-1 vali- Figure 2: Top1 vs. operations, size / parameters.
dation accuracies for top scoring single-model archi- Top-1 one-crop accuracy versus amount of operations
tectures. We introduce with this chart our choice of required for a single forward pass. The size of the
colour scheme, which will be used throughout this blobs is proportional to the number of network pa-
publication to distinguish effectively different archi- rameters; a legend is reported in the bottom right cor-
tectures and their correspondent authors. Notice that ner, spanning from 5⇥106 to 155⇥106 params. Both
networks of the same group share the same hue, for these figures share the same y-axis, and the grey dots
example ResNet are all variations of pink. highlight the centre of the blobs.

Canziani, A., Paszke, A., & Culurciello, E. (2016). An analysis of deep neural network models for practical
single arXiv
applications. VGG-161arXiv:1605.07678.
run of preprint (Simonyan & Zisserman, 2014) and GoogLeNet (Szegedy et al., 2014) are
8.70% and 10.07% respectively, revealing that VGG-16 performs better than GoogLeNet. When
2
models are run with 10-crop sampling, then the errors become 9.33% and 9.15% respectively, and
Sebastian Raschka STAT 453: Intro to Deep Learning
therefore VGG-16 will perform worse than GoogLeNet, using a single central-crop. For this reason, 53
Convolutions with Color Channels

Sebastian Raschka, Vahid Mirjalili. Python Machine

Learning. 3rd Edition. Birmingham, UK: Packt
Publishing, 2019. ISBN: 978-1789955750

Image dimension: X 2 Rn1 ⇥n2 ⇥cin in NWHC format,

<latexit sha1_base64="GzJ9t8GJxsmHc4EQcbLCmf2kfPA=">AAACIXicbVDLSsNAFJ34rPUVdelmsAiuSlIFuyy6cVnFPqCpYTKdtEMnkzBzI5SQX3Hjr7hxoUh34s84fQjaemDg3HPv5c45QSK4Bsf5tFZW19Y3Ngtbxe2d3b19++CwqeNUUdagsYhVOyCaCS5ZAzgI1k4UI1EgWCsYXk/6rUemNI/lPYwS1o1IX/KQUwJG8u2qFxEYBGHWzrHHJZ6VQXaXP2TSd7EHPGIaS7/yQ6mfcZnnvl1yys4UeJm4c1JCc9R9e+z1YppGTAIVROuO6yTQzYgCTgXLi16qWULokPRZx1BJzLFuNnWY41Oj9HAYK/Mk4Kn6eyMjkdajKDCTEwN6sTcR/+t1UgirXWMoSYFJOjsUpgJDjCdx4R5XjIIYGUKo4uavmA6IIhRMqEUTgrtoeZk0K2X3vFy5vSjVruZxFNAxOkFnyEWXqIZuUB01EEVP6AW9oXfr2Xq1PqzxbHTFmu8coT+wvr4Bx1+j5A==</latexit>

CUDA & PyTorch use NCWH

Sebastian Raschka STAT 453: Intro to Deep Learning 54

Interpreting (the Hidden)
ConvNet Layers
1. What CNNs Can Do
2. Image Classi cation
3. Convolutional Neural Network Basics
4. Convolutional Filters and Weight-Sharing
5. Cross-Correlation vs Convolution
6. CNNs & Backpropagation
7. CNN Architectures
8. What a CNN Can See
9. CNNs in PyTorch

Sebastian Raschka STAT 453: Intro to Deep Learning 55

fi
What a CNN Can See
Simple example: vertical edge detector

(From classical computer vision

research)

Sebastian Raschka STAT 453: Intro to Deep Learning 56

What a CNN Can See
Simple example: vertical edge detector

Sebastian Raschka STAT 453: Intro to Deep Learning 57

What a CNN Can See
Simple example: horizontal edge detector

A CNN can learn whatever it nds

best based on optimizing the objective
(e.g., minimizing a particular loss
to achieve good classi cation accuracy)

Sebastian Raschka STAT 453: Intro to Deep Learning 58

fi
fi
rward pass and do not participate in back-
ral network samples a different architecture,
Main Breakthrough for CNNs:
reduces complex co-adaptations of neurons,
ar other neurons. It is, therefore, forced to
n with many different random subsets of the
AlexNet & ImageNet
ut multiply their outputs by 0.5, which is a
of the predictive distributions produced by

nt
nd
nt
In Figure 3: 96 convolutional kernels of size
er: 11⇥11⇥3 learned by the first convolutional
or layer on the 224⇥224⇥3 input images. The Figure 4: (Left) Eight ILSVRC-2010 test images and the five labels considered most probable by our model.
The correct label is written under each image, and the probability assigned to the correct label is also shown
top 48 kernels were learned on GPU 1 while with a red bar (if it happens to be in the top 5). (Right) Five ILSVRC-2010 test images in the first column. The
the bottom 48 kernels were learned on GPU remaining columns show the six training images that produce feature vectors in the last hidden layer with the
smallest Euclidean distance from the feature vector for the test image.
2. See Section 6.1 for details.
i
In the left panel of Figure 4 we qualitatively assess what the network has learned by computing its
top-5 predictions on eight test images. Notice that even off-center objects, such as the mite in the
D E top-left, can be recognized by the net. Most of the top-5 labels appear reasonable. For example,
ble, ✏ is the learning rate, and @w
@L
wi D
is only other types of cat are considered plausible labels for the leopard. In some cases (grille, cherry)
there is genuine ambiguity about the intended focus of the photograph.
he objective with respect to w, evaluated at G. E.
Krizhevsky, A., Sutskever, I., & Hinton, (2012). Imagenet
the network’s classification with thedeepfeatureconvolutional neural
i
Another way to probe visual knowledge is to consider activations induced
networks. In Advances in neural information processing
by an systems (pp.
image at the last, 4096-dimensional 1097-1105).
hidden layer. If two images produce feature activation
vectors with a small Euclidean separation, we can say that the higher levels of the neural network
consider them to be similar. Figure 4 shows five images from the test set and the six images from
ean Gaussian distribution with standard de- the training set that are most similar to each of them according to this measure. Notice that at the
pixel level, the retrieved training images are generally not close in L2 to the query images in the first
cond, fourth, and fifth convolutional layers, column. For example, the retrieved dogs and elephants appear in a variety of poses. We present the
e constant 1. This initialization accelerates
Sebastian Raschka
results for manySTAT 453:
more test Introinto
images theDeep Learning
supplementary material. 59
repeated in layers 2,3,4,5. The last two layers are fully connected, taking features from

What a CNN Can See

the top convolutional layer as input in vector form (6 · 6 · 256 = 9216 dimensions).
The final layer is a C-way softmax function, C being the number of classes. All filters
and feature maps are square in shape.
Which patterns from the training set activate the feature map?

Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

Fig. 4. Evolution of a randomly chosen subset of model features through training.

Each layer’s features are displayed in a different block. Within each block, we show
a randomly chosen subset of features at epochs [1,2,5,10,20,30,40,64]. The visualiza-
tion shows the strongest activation (across all training examples) for a given feature
map, projected down to pixel space using our deconvnet approach. Color contrast is
artificially enhanced and the figure is best viewed in electronic form.
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional
networks. European
occluderIncovers theconference on computer
image region vision (pp.
that appears 818-833).
in the Springer, Cham.
visualization, we see a
strongbackpropagate
Method: drop in activity
stronginactivation
the feature map.
signals This layers
in hidden showstothat the images,
the input visualization
then
genuinely
apply corresponds
"unpooling" tovalues
to map the the image
to the structure that
original pixel stimulates
space for that feature map,
hence validating the other visualizations shown in Fig. 4 and Fig. 2.
visualization

5 Experiments Sebastian Raschka STAT 453: Intro to Deep Learning 60

What a CNN Can See
Which patterns from the training set activate the feature map?
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional
networks.
824 In European
M.D. Zeiler andconference
R. Ferguson computer vision (pp. 818-833). Springer, Cham.

Layer 1

Layer 2

Sebastian Raschka STAT 453: Intro to Deep Learning 61

What a CNN Can See
Which patterns from the training set activate the feature map?
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional
networks. In European conference on computer vision (pp. 818-833). Springer, Cham.
Layer 2

Layer 3

Sebastian Raschka STAT 453: Intro to Deep Learning 62

What a CNN Can See
Which patterns from the training set activate the feature map?
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional
networks. In European
Layer 3 conference on computer vision (pp. 818-833). Springer, Cham.

Layer 4 Layer 5
Fig. 2. Visualization of features in a fully trained model. For layers 2-5 we show the top
Sebastian Raschka STAT 453: Intro to Deep Learning 63
https://thegradient.pub/a-visual-history-of-interpretation-for-image-recognition/

Sebastian Raschka STAT 453: Intro to Deep Learning 64

Let's Implement CNNs in
PyTorch
1. What CNNs Can Do
2. Image Classi cation
3. Convolutional Neural Network Basics
4. Convolutional Filters and Weight-Sharing
5. Cross-Correlation vs Convolution
6. CNNs & Backpropagation
7. CNN Architectures
8. What a CNN Can See
9. CNNs in PyTorch

Sebastian Raschka STAT 453: Intro to Deep Learning 65

General Training Question Paper Test 10
57% (23)
General Training Question Paper Test 10
22 pages
Deep Learning For Computer Vision PDF
7% (14)
Deep Learning For Computer Vision PDF
24 pages
General Power of Attorney
No ratings yet
General Power of Attorney
2 pages
L12_intro-cnn-part1__slides
No ratings yet
L12_intro-cnn-part1__slides
56 pages
Week8 WEB
No ratings yet
Week8 WEB
54 pages
Introduction to Deep Learning 17th January 2025 (2)
No ratings yet
Introduction to Deep Learning 17th January 2025 (2)
60 pages
Introduction To Deep Convolutional Neural Networks: March 2016
No ratings yet
Introduction To Deep Convolutional Neural Networks: March 2016
51 pages
7 CNN
No ratings yet
7 CNN
66 pages
Convolutional Neural PDF
No ratings yet
Convolutional Neural PDF
187 pages
CNN For Deep Learning - Convolutional Neural Networks
No ratings yet
CNN For Deep Learning - Convolutional Neural Networks
10 pages
CVlecture 6
No ratings yet
CVlecture 6
33 pages
anthony
No ratings yet
anthony
33 pages
CNN Eem305
100% (1)
CNN Eem305
7 pages
Lecture 5
No ratings yet
Lecture 5
36 pages
Neural Network Notes
No ratings yet
Neural Network Notes
268 pages
L01-intro_slides
No ratings yet
L01-intro_slides
67 pages
80879v00 Deep Learning Ebook
No ratings yet
80879v00 Deep Learning Ebook
15 pages
Deep Convolutional Neural Networks For Image Classification: Many Slides From Rob Fergus (NYU and Facebook)
No ratings yet
Deep Convolutional Neural Networks For Image Classification: Many Slides From Rob Fergus (NYU and Facebook)
55 pages
Cnn
No ratings yet
Cnn
56 pages
Convolutional Neural Networks
100% (1)
Convolutional Neural Networks
31 pages
Guddu jha_organized
No ratings yet
Guddu jha_organized
3 pages
ch4_CNN
No ratings yet
ch4_CNN
35 pages
4b Image Processing
No ratings yet
4b Image Processing
63 pages
PEC CS 802C Deep Learning
No ratings yet
PEC CS 802C Deep Learning
13 pages
Research and Prospect of Image Recognition Based o
No ratings yet
Research and Prospect of Image Recognition Based o
7 pages
Dlincv 161110052148 PDF
No ratings yet
Dlincv 161110052148 PDF
271 pages
Lecture 08
No ratings yet
Lecture 08
43 pages
An Introduction to Convolutional Neural Networks_ a Comprehensive Guide to CNNs in Deep Learning _ DataCamp
No ratings yet
An Introduction to Convolutional Neural Networks_ a Comprehensive Guide to CNNs in Deep Learning _ DataCamp
14 pages
DL_Unit3_1 (1)
No ratings yet
DL_Unit3_1 (1)
67 pages
Deep Learning notes
No ratings yet
Deep Learning notes
155 pages
cnn-notes-unit-3-notes
No ratings yet
cnn-notes-unit-3-notes
17 pages
CNN Model For Image Classification Using Resnet: Dr. Senbagavalli M & Swetha Shekarappa G
No ratings yet
CNN Model For Image Classification Using Resnet: Dr. Senbagavalli M & Swetha Shekarappa G
10 pages
CV Ss16 0609 Deep Learning
No ratings yet
CV Ss16 0609 Deep Learning
91 pages
Deep Learning Applications
No ratings yet
Deep Learning Applications
309 pages
CERN Deep Learning and Vision
No ratings yet
CERN Deep Learning and Vision
72 pages
Convolutional Neural Networks Notes
No ratings yet
Convolutional Neural Networks Notes
29 pages
Introduction to Convolutional Neural Networks1-Unit3.docx
No ratings yet
Introduction to Convolutional Neural Networks1-Unit3.docx
10 pages
BMM 2018 - Deep Learning Tutorial
No ratings yet
BMM 2018 - Deep Learning Tutorial
47 pages
Introducing Deep Learning With MATLAB
No ratings yet
Introducing Deep Learning With MATLAB
15 pages
CNN 2
No ratings yet
CNN 2
47 pages
2019 6S191 L3 PDF
No ratings yet
2019 6S191 L3 PDF
71 pages
Introduction To Convolutional Neural Networks
No ratings yet
Introduction To Convolutional Neural Networks
41 pages
7 Applications of Convolutional Neural Networks - FWS
No ratings yet
7 Applications of Convolutional Neural Networks - FWS
3 pages
Matconvnet: Convolutional Neural Networks For Matlab
No ratings yet
Matconvnet: Convolutional Neural Networks For Matlab
55 pages
MA - Koelbl Memoire CNN
No ratings yet
MA - Koelbl Memoire CNN
79 pages
Vbook - Pub Deep Learning For Computer Visionpdf
No ratings yet
Vbook - Pub Deep Learning For Computer Visionpdf
24 pages
Harley MSC Thesis Menos Especializadpo
No ratings yet
Harley MSC Thesis Menos Especializadpo
71 pages
Deep Learning For Computer Vision PDF
No ratings yet
Deep Learning For Computer Vision PDF
24 pages
Lecture06 - Copie
No ratings yet
Lecture06 - Copie
52 pages
Deep Learning for Medical Image Analysis 1st Edition S. Kevin Zhou - Download the complete ebook in PDF format and read freely
100% (2)
Deep Learning for Medical Image Analysis 1st Edition S. Kevin Zhou - Download the complete ebook in PDF format and read freely
71 pages
Objectdetection
No ratings yet
Objectdetection
7 pages
Syllabus Udacity Default en Us
No ratings yet
Syllabus Udacity Default en Us
4 pages
Lecture 6
No ratings yet
Lecture 6
68 pages
Convolutional Neural Networks: CMSC 733 Fall 2015 Angjoo Kanazawa
No ratings yet
Convolutional Neural Networks: CMSC 733 Fall 2015 Angjoo Kanazawa
55 pages
Deep Learning: Alberto Ezpondaburu
No ratings yet
Deep Learning: Alberto Ezpondaburu
58 pages
An Overview of Convolutional Neural Network Architectures For Deep Learning
No ratings yet
An Overview of Convolutional Neural Network Architectures For Deep Learning
22 pages
C# Mastery: A Comprehensive Guide to Advanced C# Features and Applications
From Everand
C# Mastery: A Comprehensive Guide to Advanced C# Features and Applications
Lena Neill
No ratings yet
Station Research Aggregation Spreadsheet
No ratings yet
Station Research Aggregation Spreadsheet
38 pages
Bus Math Ni Gold 2 1
No ratings yet
Bus Math Ni Gold 2 1
10 pages
General Obligations Under FIU-InD Guidelines_Krypsm Private Limited
No ratings yet
General Obligations Under FIU-InD Guidelines_Krypsm Private Limited
22 pages
Alok - Annual Report 2010-11 - Low Rais
No ratings yet
Alok - Annual Report 2010-11 - Low Rais
200 pages
Smart Hot Chamber in Refrigeration System Based
No ratings yet
Smart Hot Chamber in Refrigeration System Based
5 pages
Reductor Excentrico y Concetrico
No ratings yet
Reductor Excentrico y Concetrico
1 page
Delta Construction PST Case V8.unlocked
No ratings yet
Delta Construction PST Case V8.unlocked
22 pages
Mastering Nursing Academics - Unlocking The Path To Success With Expert Writing Support
No ratings yet
Mastering Nursing Academics - Unlocking The Path To Success With Expert Writing Support
3 pages
Wild Swans Thesis Statement
100% (3)
Wild Swans Thesis Statement
6 pages
Trading List PDF
No ratings yet
Trading List PDF
113 pages
Cdo SS 0 PDF
100% (2)
Cdo SS 0 PDF
80 pages
Machine Learning and Hadoop
No ratings yet
Machine Learning and Hadoop
26 pages
ASTM-D6238-98-2017-
No ratings yet
ASTM-D6238-98-2017-
3 pages
Appendix Table 7 - Upper Critical Values of Chi-Square Distribution
No ratings yet
Appendix Table 7 - Upper Critical Values of Chi-Square Distribution
2 pages
Unit3 Harsh Awasthi
No ratings yet
Unit3 Harsh Awasthi
169 pages
Brochure Technical Analysis-Chart Patterns-Capital Markets
No ratings yet
Brochure Technical Analysis-Chart Patterns-Capital Markets
19 pages
Hyderabad Heritage Legislations
No ratings yet
Hyderabad Heritage Legislations
7 pages
Thesis On Gwadar Port
100% (3)
Thesis On Gwadar Port
4 pages
Sa Re Ga Ma Pa Little Champ Nepal
No ratings yet
Sa Re Ga Ma Pa Little Champ Nepal
11 pages
Theory and Experiment On Microstrip Antennas
No ratings yet
Theory and Experiment On Microstrip Antennas
9 pages
Von Thunen Theory of Agricultural Location UPSC Notes
No ratings yet
Von Thunen Theory of Agricultural Location UPSC Notes
11 pages
Overview of Hydropower Potential and Energy Sector in Myanmar
100% (1)
Overview of Hydropower Potential and Energy Sector in Myanmar
27 pages
Saline and Alkali Soil
No ratings yet
Saline and Alkali Soil
19 pages
2024-Owenomics-Dont Blame Indexing
No ratings yet
2024-Owenomics-Dont Blame Indexing
6 pages
Research2 q4 Mod1.2 Constructingscienceprojectdisplayboard v2
100% (1)
Research2 q4 Mod1.2 Constructingscienceprojectdisplayboard v2
22 pages
Harga Cat
No ratings yet
Harga Cat
2 pages
Condenser Calculator
No ratings yet
Condenser Calculator
17 pages
Hall-2002-Unnatural Gender in Hindi PDF
No ratings yet
Hall-2002-Unnatural Gender in Hindi PDF
30 pages