
NPTEL Online Certification Courses

Indian Institute of Technology Kharagpur

Deep Learning
Assignment- Week 8
TYPE OF QUESTION: MCQ/MSQ
Number of questions: 10    Total marks: 10 × 1 = 10
______________________________________________________________________________

QUESTION 1:
A softmax layer is applied to an activation vector x ∈ R^n:

y = softmax(x)

Suppose a constant vector c·1_n is subtracted from x (c ∈ R is a scalar and 1_n ∈ R^n is the n-component vector of all 1s), such that the new output ŷ is:

ŷ = softmax(x − c·1_n)

Select the right option

a. KLDivergence(y | ŷ) = 0
b. KLDivergence(y | ŷ) = 1
c. KLDivergence(y | ŷ) = ∞
d. y and ŷ cannot be compared
Correct Answer: a
Detailed Solution:

The softmax function is translation invariant: softmax(x − c·1_n) = softmax(x). Since e^(x_i − c) = e^(−c) · e^(x_i), the common factor e^(−c) cancels in the normalization, so ŷ = y and KLDivergence(y | ŷ) = 0.
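
A minimal sketch (assuming NumPy; not part of the original solution) verifying the shift invariance numerically:

```python
# Minimal numerical check (NumPy assumed) of softmax shift invariance.
import numpy as np

def softmax(x):
    # Subtracting max(x) for numerical stability is itself an instance
    # of the shift invariance this question tests.
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([1.0, 2.0, 3.0])
c = 5.0
y = softmax(x)
y_hat = softmax(x - c)                    # shift every component by c

kl = np.sum(y * np.log(y / y_hat))        # KL(y || y_hat)
print(np.allclose(y, y_hat), kl)          # True, ~0.0
```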


______________________________________________________________________________

QUESTION 2:
An RGB input image has been converted into a matrix of size 257 × 257 × 3, and a kernel/filter of size 7 × 7 × 3 with a stride of 2 and padding of 3 is used for 2D convolution. What will be the size of the output of the convolution?

a. 129x129x1
b. 128x128x1
c. 254x254x3
d. 256x256x1

Correct Answer: a
Detailed Solution:

The spatial size of the convolved output is C × C, where C = ((I − F + 2P)/S) + 1; I is the size of the input matrix, F the size of the filter, P the padding applied to the input, and S the stride. Here I = 257, F = 7, P = 3, and S = 2, so C = ((257 − 7 + 6)/2) + 1 = 129. A single filter produces one output channel, so the answer is 129 × 129 × 1.
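
As a quick check, a hypothetical helper (plain Python, not from the assignment) that evaluates the formula:

```python
# Hypothetical helper (not from the assignment) for the output-size formula
# C = (I - F + 2P) / S + 1.
def conv_output_size(i, f, p, s):
    return (i - f + 2 * p) // s + 1

print(conv_output_size(257, 7, 3, 2))  # 129 -> output is 129 x 129 x 1
```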
______________________________________________________________________________

QUESTION 3:
What is the primary reason for adding pooling layers?

a. Promote small shift invariance
b. Reduce computations for subsequent layers
c. To produce activations that summarize filter responses in local windows
d. Both b and c

Correct Answer: d
Detailed Solution:

A pooling layer reduces the compute requirements of subsequent layers and locally summarizes the responses in its input. In contrast to earlier wisdom, recent research has indicated that pooling actually makes CNNs less robust to small spatial shifts (refer to BlurPool).
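
A minimal sketch (assuming NumPy) of 2 × 2 max pooling with stride 2, illustrating how pooling summarizes each local window and halves the spatial size:

```python
# Minimal 2x2 max pooling with stride 2 (NumPy assumed).
import numpy as np

def max_pool_2x2(x):
    h, w = x.shape
    # Crop to even dimensions, then take the max over each 2x2 window.
    x = x[:h - h % 2, :w - w % 2]
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(fmap))  # 2x2 output: one summary value per local window
```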

______________________________________________________________________________

QUESTION 4:
The figure below shows an image of a face given as input to a convolutional neural network, and the other three images show different levels of features extracted from the network. Which of the following options is correct?

a. Label 3: Low-level features, Label 2: High-level features, Label 1: Mid-level features
b. Label 1: Low-level features, Label 3: High-level features, Label 2: Mid-level features
c. Label 2: Low-level features, Label 1: High-level features, Label 3: Mid-level features
d. Label 3: Low-level features, Label 1: High-level features, Label 2: Mid-level features

Correct Answer: b
Detailed Solution:

A convolutional neural network learns low-level features such as edges and lines in its early layers, then parts of faces in intermediate layers, and finally a high-level representation of the whole face in its deepest layers.

______________________________________________________________________________

QUESTION 5:
Suppose you have 8 convolutional kernels of size 5 × 5 with no padding and stride 1 in the first layer of a convolutional neural network. You pass an input of dimension 228 × 228 × 3 through this layer. What are the dimensions of the data that the next layer will receive?

a. 224 x 224 x 3
b. 224 x 224 x 8
c. 226 x 226 x 8
d. 225 x 225 x 3

Correct Answer: b
Detailed Solution:

The layer accepts a volume of size W1 × H1 × D1; in our case, 228 × 228 × 3.

It requires four hyperparameters: the number of filters K = 8, their spatial extent F = 5, the stride S = 1, and the amount of padding P = 0.

It produces a volume of size W2 × H2 × D2 = 224 × 224 × 8, where W2 = (W1 − F + 2P)/S + 1 = (228 − 5)/1 + 1 = 224, H2 = (H1 − F + 2P)/S + 1 = (228 − 5)/1 + 1 = 224 (width and height are equal by symmetry), and D2 = K = 8.
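
A hedged check (assuming PyTorch is available) that such a layer indeed maps 228 × 228 × 3 to 224 × 224 × 8:

```python
# Shape check with PyTorch (assumed available): 8 filters of size 5x5,
# stride 1, no padding, applied to a 228 x 228 x 3 input.
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=5, stride=1, padding=0)
x = torch.randn(1, 3, 228, 228)   # one RGB image in NCHW layout
print(conv(x).shape)              # torch.Size([1, 8, 224, 224])
```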

____________________________________________________________________________

QUESTION 6:
Choose the correct statement in the context of transfer learning.

a. Higher layers learn task-specific features, whereas lower layers learn general features
b. Transfer learning is generally used in situations where the task-specific dataset is very small
c. The weights of the lower layers of a pretrained CNN (trained on a dataset such as ImageNet) are copied, the higher layers are randomly/Gaussian initialized, and the entire network is fine-tuned on the smaller dataset
d. All of the above

Correct Answer: d

Detailed Solution:

Lower layers learn more general features (for example, edge detectors) and thus transfer well to other tasks, whereas higher layers are task-specific. Transfer learning is used in data-scarce situations.
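
A hedged sketch (assuming torchvision; the 10-class target task is hypothetical) of the recipe in option c:

```python
# Transfer-learning sketch (torchvision assumed; the 10-class target task
# is hypothetical): copy pretrained lower layers, re-initialize the head,
# and fine-tune on the smaller dataset.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")    # pretrained on ImageNet

# Replace the final classifier with a freshly (randomly) initialized head.
model.fc = nn.Linear(model.fc.in_features, 10)

# Option: freeze the general lower layers and train only the new head;
# alternatively, fine-tune the whole network with a small learning rate.
for name, param in model.named_parameters():
    if not name.startswith("fc"):
        param.requires_grad = False
```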

____________________________________________________________________________

QUESTION 7:
An advantage of ReLU over sigmoid and tanh is

a. Low computational requirements
b. Alleviates vanishing gradients to some extent
c. Backpropagation is simpler
d. All of the above

Correct Answer: d
Detailed Solution:

ReLU does not saturate for positive inputs, is cheaper to compute, and has a simpler gradient with respect to its input.
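
A minimal sketch (assuming NumPy) contrasting the gradients: the sigmoid derivative peaks at 0.25 and vanishes for large |x|, while the ReLU derivative is exactly 1 for every positive input:

```python
# Gradient comparison (NumPy assumed): sigmoid'(x) <= 0.25 and vanishes
# for large |x|; ReLU'(x) is exactly 1 for every positive input.
import numpy as np

x = np.array([-10.0, -1.0, 0.5, 10.0])
sig = 1.0 / (1.0 + np.exp(-x))
print(sig * (1.0 - sig))          # sigmoid gradient: ~0 at |x| = 10
print((x > 0).astype(float))      # ReLU gradient: 0 or 1, no saturation
```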

______________________________________________________________________________

QUESTION 8:
Statement 1: For a transfer learning task, lower layers are more generally transferred to
another task

Statement 2: For a transfer learning task, last few layers are more generally transferred to
another task

Which of the following options is correct?


a. Statement 1 is correct and Statement 2 is incorrect
b. Statement 1 is incorrect and Statement 2 is correct
c. Both Statement 1 and Statement 2 are correct
d. Both Statement 1 and Statement 2 are incorrect

Correct Answer: a

Detailed Solution:

Lower layers learn more general features (for example, edge detectors) and thus transfer well to other tasks, whereas higher layers are task-specific.

______________________________________________________________________________

QUESTION 9:
Statement 1: Adding more hidden layers will solve the vanishing gradient problem for a 2-layer
neural network

Statement 2: Making the network deeper will increase the chance of vanishing gradients.

a. Statement 1 is correct
b. Statement 2 is correct
c. Neither Statement 1 nor Statement 2 is correct
d. The vanishing gradient problem is independent of the number of hidden layers of the neural network.

Correct Answer: b
Detailed Solution:

As more layers using certain (saturating) activation functions are added to a neural network, the gradients of the loss function with respect to the early layers approach zero, making the network hard to train. Thus Statement 2 is correct.
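
An illustrative sketch (plain Python, not from the solution): ignoring the weight terms, backpropagation through k sigmoid layers multiplies k derivatives each at most 0.25, so the factor decays exponentially with depth:

```python
# Illustration (plain Python): the chained sigmoid-derivative factor is
# bounded by 0.25 per layer, so it shrinks exponentially with depth.
for depth in (2, 10, 50):
    print(depth, 0.25 ** depth)   # 0.0625, ~9.5e-07, ~7.9e-31
```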

____________________________________________________________________________

QUESTION 10:
How many convolution layers are there in a LeNet-5 architecture?

a. 2
b. 3
c. 4
d. 5

Correct Answer: a
Detailed Solution:

There are two convolutional layers and three fully connected layers in the LeNet-5 architecture.
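
A hedged PyTorch sketch (assuming the classic 32 × 32 grayscale input) of LeNet-5 as counted in the solution, with two convolutional layers followed by three fully connected layers:

```python
# LeNet-5 sketch (PyTorch assumed; classic 32x32 grayscale input):
# two convolutional layers + three fully connected layers.
import torch.nn as nn

lenet5 = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),   # conv 1
    nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),  # conv 2
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Tanh(),                        # FC 1
    nn.Linear(120, 84), nn.Tanh(),                                # FC 2
    nn.Linear(84, 10),                                            # FC 3
)
```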

______________________________________________________________________________

************END*******