
Deep Learning Concepts and Techniques

The document contains a series of questions and answers related to deep learning concepts, including its definition, advantages, training methods, and various types of neural networks. Key topics covered include the function of layers in neural networks, activation functions, optimization techniques, and the significance of overfitting and underfitting. The document serves as a comprehensive quiz or study guide on deep learning fundamentals.


Question 1. What is Deep Learning a subfield of?

A. Computer Vision

B. Machine Learning

C. Data Science

D. Artificial Intelligence

ANSWER: B

Question 2. What do Deep Learning algorithms attempt to learn by using a hierarchy of
multiple layers?

A. Prediction rules

B. Multiple levels of representation

C. Optimization functions

D. Classification boundaries

ANSWER: B

Question 3. Compared to manually designed features, what is a key advantage of learned
features in Deep Learning?

A. They are more over-specified

B. They take longer to design

C. They are easier to adapt

D. They are more incomplete

ANSWER: C

Question 4. Around what year did Deep Learning start outperforming other machine
learning techniques in speech and vision?

A. 2010

B. 2015

C. 2005

D. 2000
ANSWER: A

Question 5. According to the "short answer" in the slides, what does "Deep Learning"
specifically mean?

A. Using a neural network with several layers between input and output

B. Using a neural network with one hidden layer

C. Using unsupervised clustering algorithms

D. Using reinforcement learning with deep Q-networks

ANSWER: A

Question 6. What is the described function of the series of layers between input and output
in a deep neural network?

A. Data storage and retrieval

B. Feature identification and processing in stages

C. Parallel computation distribution

D. Input normalization and scaling

ANSWER: B

Question 7. What was the historical problem with training neural networks with many
hidden layers?

A. There wasn't enough training data available

B. The available hardware couldn't handle the computations

C. The weight-learning algorithms didn't work well on multi-layer architectures

D. The algorithms were too fast and unstable

ANSWER: C

Question 8. In the described new way to train multi-layer neural networks, what is the first
step?

A. Train all layers simultaneously


B. Train the first layer

C. Train the middle layer

D. Train the final output layer

ANSWER: B

Question 9. According to the slides, what is each non-output layer trained to be in the new
training approach?

A. A classifier

B. An auto-encoder

C. A predictor

D. A generator

ANSWER: B

Question 10. What is an auto-encoder trained to do?

A. Classify inputs into categories

B. Predict future values in a sequence

C. Generate new synthetic data

D. Reproduce the input

ANSWER: D

Question 11. How does forcing an auto-encoder to reproduce input with fewer units than
the inputs affect the hidden layer?

A. It becomes an output predictor

B. It becomes a random noise generator

C. It becomes a feature detector

D. It becomes a data storage unit

ANSWER: C
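The slides describe this bottleneck effect in prose only; as a minimal numpy sketch (not from the slides, weights random and untrained), an auto-encoder squeezes the input through a hidden layer smaller than the input, so the hidden activations must act as compressed features:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bottleneck auto-encoder: 6 inputs squeezed through 2 hidden units.
W_enc = rng.normal(size=(2, 6)) * 0.1   # encoder weights
W_dec = rng.normal(size=(6, 2)) * 0.1   # decoder weights

def forward(x):
    h = np.tanh(W_enc @ x)   # compressed code (the learned features)
    x_hat = W_dec @ h        # reconstruction of the input
    return h, x_hat

x = rng.normal(size=6)
h, x_hat = forward(x)
print(h.shape, x_hat.shape)  # (2,) (6,)
```

Training would adjust W_enc and W_dec to make x_hat match x; because h has only 2 units, the hidden layer is forced to become a feature detector.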

Question 12. What is the final layer trained to do in the described deep learning approach?
A. Store training data

B. Reproduce the original input

C. Act as another auto-encoder

D. Predict class based on outputs from previous layers

ANSWER: D

Question 13. According to the slides, why are many-layer neural network architectures capable
of generalizing well?

A. They require less training data

B. They run faster

C. They use less memory

D. They can learn true underlying features and "feature logic"

ANSWER: D

Question 14. What aspect of deep learning is described as a "very fast growing area"?

A. The cost of implementation

B. The hardware requirements

C. The types of autoencoders, architectural variations, and training algorithms

D. The number of available programmers

ANSWER: C

Question 15. The slides state that "AI is the new Electricity." What is the analogy being
made?

A. Both are expensive to implement

B. Both will transform numerous industries

C. Both require specialized education

D. Both are natural resources

ANSWER: B
Question 16. What is the purpose of activation functions in neural networks according to
the slides?

A. To normalize input data

B. To introduce non-linearity and allow learning of complex mappings

C. To reduce memory usage

D. To increase computational speed

ANSWER: B

Question 17. Which activation function is specifically mentioned as "squashing input
between -1 and 1"?

A. Softmax

B. ReLU

C. Tanh

D. Sigmoid

ANSWER: C
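A quick numpy check (not part of the slides) confirms the squashing behavior: tanh maps any real input into the open interval (-1, 1):

```python
import numpy as np

# Evaluate tanh over a wide input range.
x = np.linspace(-10, 10, 1001)
y = np.tanh(x)
print(y.min(), y.max())  # bounded strictly between -1 and 1
```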

Question 18. What problem does Leaky ReLU address?

A. Slow convergence

B. The dead ReLU problem

C. Vanishing gradients

D. Exploding gradients

ANSWER: B
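To illustrate the fix (a sketch, not from the slides; the slope 0.01 is a common but arbitrary choice): plain ReLU outputs exactly zero for negative inputs, so such a unit can stop learning ("die"), while Leaky ReLU keeps a small negative slope:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Negative inputs keep a small slope instead of a hard zero,
    # so the unit still receives gradient and cannot "die".
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))        # negatives clamped to 0
print(leaky_relu(x))  # negatives scaled by 0.01 instead
```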

Question 19. In forward propagation, what is the basic computation for a neuron?

A. y = f(Wx + b)

B. y = f(x)

C. y = x + b

D. y = Wx

ANSWER: A
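The formula y = f(Wx + b) can be sketched directly in numpy (an illustration, not slide code; tanh stands in for the generic non-linearity f, and the weights are made-up values):

```python
import numpy as np

def neuron_layer(W, x, b):
    # Forward propagation for one layer: y = f(Wx + b),
    # with tanh as the activation function f.
    return np.tanh(W @ x + b)

W = np.array([[0.5, -0.2],
              [0.1,  0.4]])
b = np.array([0.0, 0.1])
x = np.array([1.0, 2.0])
y = neuron_layer(W, x, b)
print(y)  # tanh applied to Wx + b = [0.1, 1.0]
```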
Question 20. What is the primary role of the loss function?

A. To measure the difference between prediction and true value

B. To normalize activations

C. To select the best features

D. To update weights

ANSWER: A
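As a concrete example (not from the slides), mean squared error is one such loss, directly measuring the gap between prediction and true value:

```python
import numpy as np

def mse(y_true, y_pred):
    # Loss = mean squared difference between prediction and true value.
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])
print(mse(y_true, y_pred))  # (0.25 + 0 + 1) / 3
```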

Question 21. A neural network consists of layers of neurons. Each neuron computes?

A. y = Wx

B. y = f(Wx + b)

C. y = x + b

D. y = f(x)

ANSWER: B

Question 22. What are the three main layers in a basic neural network?

A. Input, Convolutional, Output

B. Input, Hidden, Output

C. Hidden, Dense, Output

D. Input, Recurrent, Output

ANSWER: B

Question 23. What type of layer learns intermediate representations?

A. Input

B. Hidden

C. Output

D. Normalization

ANSWER: B
Question 24. What type of layer produces the final prediction?

A. Hidden

B. Input

C. Output

D. Dropout

ANSWER: C

Question 25. What do weights measure in a neural network?

A. Learning speed

B. Connection strength

C. Error rate

D. Activation output

ANSWER: B

Question 26. What do biases provide in a neural network?

A. Noise

B. Flexibility

C. Regularization

D. Normalization

ANSWER: B

Question 27. What process passes input data through the network layer by layer?

A. Back propagation

B. Optimization

C. Forward propagation

D. Regularization

ANSWER: C
Question 28. What algorithm computes gradients of the loss function with respect to
weights?

A. Optimization

B. Back propagation

C. Forward pass

D. Dropout

ANSWER: B

Question 29. What is the process of updating weights to minimize loss?

A. Feature extraction

B. Optimization

C. Normalization

D. Inference

ANSWER: B

Question 30. What does the learning rate control?

A. Number of neurons

B. Weight update size

C. Dataset size

D. Model depth

ANSWER: B

Question 31. What does an epoch represent?

A. One parameter update

B. One batch

C. One full pass over the dataset

D. One neuron activation


ANSWER: C

Question 32. What is overfitting?

A. Poor learning

B. Memorization of training data

C. Fast convergence

D. Under-optimization

ANSWER: B

Question 33. What is underfitting?

A. Excessive memorization

B. Poor learning

C. High variance

D. Data leakage

ANSWER: B

Question 34. What technique randomly drops neurons during training?

A. Normalization

B. Dropout

C. Pooling

D. Attention

ANSWER: B
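A minimal sketch of dropout (not slide code; this is the common "inverted dropout" variant, with drop probability p = 0.5 as an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(42)

def dropout(activations, p=0.5, training=True):
    # During training, zero each unit with probability p and rescale the
    # survivors by 1/(1-p); at inference, pass activations through unchanged.
    if not training:
        return activations
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

a = np.ones(10)
print(dropout(a))                   # some entries zeroed, survivors scaled
print(dropout(a, training=False))   # unchanged at inference
```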

Question 35. What type of neural network extracts spatial features?

A. RNN

B. CNN

C. MLP

D. Transformer
ANSWER: B

Question 36. What type of neural network captures temporal sequence dependencies?

A. CNN

B. RNN

C. MLP

D. Autoencoder

ANSWER: B

Question 37. What mechanism is the core of Transformers?

A. Pooling

B. Convolution

C. Attention

D. Recurrence

ANSWER: C

Question 38. What are autoencoders mainly used for?

A. Classification

B. Representation learning

C. Reinforcement learning

D. Optimization

ANSWER: B

Question 39. What type of models generate new data?

A. Discriminative models

B. Generative models

C. Linear models

D. Deterministic models
ANSWER: B

Question 40. What is transfer learning?

A. Training from scratch

B. Fine-tuning pre-trained models

C. Removing layers

D. Feature normalization

ANSWER: B

Question 41. What metric is commonly used for classification?

A. MAE

B. MSE

C. Accuracy

D. PSNR

ANSWER: C

Question 42. What metric is commonly used for regression?

A. Accuracy

B. F1-score

C. MSE

D. SSIM

ANSWER: C

Question 43. What are hyperparameters?

A. Learned parameters

B. Settings chosen before training

C. Gradients

D. Activations
ANSWER: B

Question 44. What problem occurs when gradients become extremely small?

A. Exploding gradients

B. Vanishing gradients

C. Overfitting

D. Underfitting

ANSWER: B
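A small numeric illustration (not from the slides) of why this happens with sigmoid activations: the sigmoid derivative is at most 0.25, and the chain rule multiplies one such factor per layer, so the gradient shrinks geometrically with depth:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# The sigmoid derivative s(x) * (1 - s(x)) peaks at 0.25 (at x = 0).
# Back propagation multiplies one such factor per layer.
d = sigmoid(0.0) * (1 - sigmoid(0.0))  # 0.25, the maximum possible
print(d ** 10)  # best-case gradient factor after 10 layers: under 1e-6
```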

Question 45. What technique normalizes activations per batch?

A. Dropout

B. Batch normalization

C. Pooling

D. Attention

ANSWER: B
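The core normalization step can be sketched as follows (a simplification, not slide code; the learnable scale and shift parameters of full batch normalization are omitted):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature (column) to zero mean and unit variance
    # across the batch dimension (rows).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

batch = np.array([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 300.0]])
normed = batch_norm(batch)
print(normed.mean(axis=0))  # approximately 0 per feature
print(normed.std(axis=0))   # approximately 1 per feature
```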

Question 46. What is the workflow step after training?

A. Data collection

B. Validation

C. Deployment

D. Preprocessing

ANSWER: B

Question 47. What type of network has no loops?

A. Recurrent

B. Feedforward

C. Residual

D. Cyclic
ANSWER: B

Question 48. Information in a feedforward network moves?

A. Backward only

B. In cycles

C. Forward only

D. Randomly

ANSWER: C

Question 49. One artificial neuron can only realize?

A. A nonlinear function

B. A linear function

C. A probabilistic function

D. A recursive function

ANSWER: B

Question 50. Many layers of neurons can train?

A. Only linear mappings

B. Arbitrarily complex functions

C. Random functions

D. Fixed rules

ANSWER: B

Question 51. What type of layer connects every input neuron to every output neuron?

A. Convolutional

B. Recurrent

C. Fully Connected

D. Normalization
ANSWER: C

Question 52. What type of layer stabilizes learning by normalizing activations?

A. Dropout

B. Fully Connected

C. Normalization

D. Convolutional

ANSWER: C

Question 53. What type of layer randomly drops neurons during training?

A. Pooling

B. Dropout

C. Normalization

D. Attention

ANSWER: B

Question 54. What do weights and biases have in common during training?

A. They are fixed

B. They are randomly dropped

C. They are learned

D. They are ignored

ANSWER: C

Question 55. What allows a neural network to learn complex mappings?

A. Loss functions

B. Linear neurons

C. Non-linearity

D. Batch size
ANSWER: C

Question 56. Which activation function is used in classification output layers?

A. ReLU

B. Tanh

C. Sigmoid

D. Softmax

ANSWER: D
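A minimal softmax sketch (not from the slides) shows why it suits classification outputs: it turns raw scores into positive values that sum to 1, interpretable as class probabilities:

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the outputs are positive
    # and sum to 1, so they can be read as class probabilities.
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
p = softmax(logits)
print(p, p.sum())  # largest logit gets the largest probability
```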

Question 57. What does back propagation compute gradients with respect to?

A. Inputs only

B. Activations only

C. Weights and biases

D. Loss only

ANSWER: C

Question 58. What rule is used in back propagation to compute gradients?

A. Bayes rule

B. Product rule

C. Chain rule

D. Sum rule

ANSWER: C
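The chain rule can be made concrete with a single sigmoid neuron and squared-error loss (an illustration, not slide code; the input, target, and weight values are made up). The hand-derived gradient is checked against a numerical one:

```python
import numpy as np

# One sigmoid neuron: y = sigmoid(w*x + b), loss L = (y - t)^2.
# The chain rule stacks local derivatives: dL/dw = dL/dy * dy/dz * dz/dw.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b, x, t = 0.5, 0.1, 2.0, 1.0
z = w * x + b
y = sigmoid(z)
dL_dy = 2 * (y - t)        # derivative of squared error w.r.t. y
dy_dz = y * (1 - y)        # sigmoid derivative
dz_dw = x                  # derivative of w*x + b w.r.t. w
dL_dw = dL_dy * dy_dz * dz_dw
print(dL_dw)

# Sanity check with a central-difference numerical gradient.
eps = 1e-6
num = ((sigmoid((w + eps) * x + b) - t) ** 2 -
       (sigmoid((w - eps) * x + b) - t) ** 2) / (2 * eps)
print(num)  # matches the chain-rule gradient
```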

Question 59. What process updates weights using computed gradients to minimize loss?

A. Forward propagation

B. Feature extraction

C. Optimization

D. Normalization
ANSWER: C

Question 60. Which optimization method is explicitly mentioned?

A. AdaBoost

B. SGD

C. K-Means

D. PCA

ANSWER: B

Question 61. Which optimization method adapts learning rates per parameter?

A. SGD

B. Momentum

C. RMSProp

D. Batch Normalization

ANSWER: C

Question 62. Which optimization method combines momentum and adaptive learning
rates?

A. SGD

B. RMSProp

C. Adam

D. AdaBoost

ANSWER: C
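The combination can be sketched as a single Adam update step (a simplified illustration, not from the slides; hyperparameter defaults follow the values commonly quoted for Adam):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam combines momentum (first moment m) with per-parameter
    # adaptive learning rates (second moment v).
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = w^2 (gradient 2w) starting from w = 1.0:
w, m, v = 1.0, 0.0, 0.0
for t in range(1, 501):
    w, m, v = adam_step(w, 2 * w, m, v, t)
print(w)  # moves toward the minimum at 0
```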

Question 63. What happens if the learning rate is too high?

A. Slow convergence

B. Stable training

C. Unstable training
D. Better generalization

ANSWER: C

Question 64. What happens if the learning rate is too low?

A. Fast convergence

B. Unstable gradients

C. Slow convergence

D. Overfitting

ANSWER: C

Question 65. What does a batch represent?

A. One full pass over data

B. A subset of training data

C. One gradient

D. One neuron

ANSWER: B

Question 66. What does an iteration represent?

A. One epoch

B. One batch

C. One parameter update

D. One dataset

ANSWER: C
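The relationship between epoch, batch, and iteration from Questions 31, 65, and 66 can be summarized in one line (the dataset and batch sizes below are made-up example numbers):

```python
import math

# With N training samples and batch size B, one epoch (a full pass over
# the dataset) consists of ceil(N / B) iterations (parameter updates).
N, B = 1000, 32
iterations_per_epoch = math.ceil(N / B)
print(iterations_per_epoch)  # 32
```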

Question 67. What is one solution to overfitting mentioned?

A. Increasing depth

B. Removing loss

C. Regularization
D. Increasing learning rate

ANSWER: C

Question 68. What type of regularization uses L1 and L2 penalties?

A. Data augmentation

B. Dropout

C. Weight regularization

D. Early stopping

ANSWER: C
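A minimal sketch (not slide code; the loss value, weights, and penalty strengths are made-up numbers) of how the L1 and L2 penalties are added to the data loss to discourage large weights:

```python
import numpy as np

def regularized_loss(data_loss, weights, l1=0.0, l2=0.0):
    # Weight regularization: add an L1 (sum of absolute values) and/or
    # L2 (sum of squares) penalty on the weights to the data loss.
    penalty = l1 * np.sum(np.abs(weights)) + l2 * np.sum(weights ** 2)
    return data_loss + penalty

w = np.array([0.5, -1.0, 2.0])
print(regularized_loss(0.3, w, l1=0.01))  # 0.3 + 0.01 * 3.5
print(regularized_loss(0.3, w, l2=0.01))  # 0.3 + 0.01 * 5.25
```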

Question 69. What neural network architecture extracts spatial features using convolution
and pooling?

A. RNN

B. CNN

C. MLP

D. Transformer

ANSWER: B

Question 70. What neural network variants handle long-term dependencies?

A. CNN

B. Autoencoders

C. LSTM and GRU

D. Perceptrons

ANSWER: C
