Deep Learning Revision Guide

The document covers various modules related to deep learning, including neural networks, loss functions, learning paradigms, and regularization techniques. It provides quick revision notes and multiple-choice questions (MCQs) for each module to test understanding of key concepts. Topics include supervised and unsupervised learning, optimization methods, and recent trends in deep learning research.

Uploaded by

sandip08.dev
Revision Topics

Module 1: Introduction to Deep Learning

Quick Revision Notes:

• Learning Paradigms:
  o Supervised: Data + labels (e.g., classification).
  o Unsupervised: No labels (e.g., clustering, dimensionality reduction).
  o Reinforcement: Agent learns via rewards.
• Perspectives & Issues in Deep Learning:
  o DL allows hierarchical feature learning.
  o Needs huge data, high computation, prone to overfitting.
  o Interpretability is low; generalization and convergence are issues.
• Fundamental Learning Techniques:
  o Linear regression, logistic regression, decision trees, SVM, etc.
  o DL builds upon these using neural network layers.

Module 2: Feedforward Neural Networks

Quick Revision:

• ANN (Artificial Neural Network): Composed of neurons (nodes), connected with weights.
• Activation Functions: Sigmoid, Tanh, ReLU, Leaky ReLU, Softmax.
• Multi-layer NN: Multiple hidden layers enable deep learning.
• Fuzzy Relations: Deals with uncertainty and imprecision.
• Cardinality: Number of elements in a fuzzy set.

Module 3: Training Neural Networks

Quick Revision:

• Loss Functions: MSE, Cross-Entropy
• Backpropagation: Gradient descent + chain rule
• Regularization: L1, L2, Dropout
• Optimization: SGD, Adam, RMSProp
• Model Selection: Choosing best architecture/hyperparams

Module 4: Conditional Random Fields

Quick Revision:

• CRFs: Discriminative models used for structured prediction
• Linear Chain CRF: For sequence data
• Partition Function (Z): Normalizes probabilities
• Belief Propagation: Inference technique
• HMM vs CRF: HMM is generative, CRF is discriminative
• Entropy: Measure of uncertainty
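The neuron and activation-function notes above can be sketched in a few lines of Python (a minimal illustration; the weights and inputs are invented for this example):

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum + bias, passed through a nonlinearity (here: sigmoid)
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def relu(x):
    # ReLU returns max(0, x)
    return max(0.0, x)

def softmax(logits):
    # Subtract the max for numerical stability; outputs sum to 1
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

out = neuron([1.0, 2.0], [0.5, -0.25], 0.1)   # sigmoid(0.1) ≈ 0.525
probs = softmax([2.0, 1.0, 0.1])              # a probability distribution
```

Stacking many such neurons into layers, with a nonlinearity between layers, is what makes a multi-layer network "deep".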


Module 5: Deep Learning

Quick Revision:

• Deep Feedforward Networks: Neural networks with multiple layers where data flows in one direction.
• Regularization Techniques: Methods like L1, L2, and dropout to prevent overfitting.
• Training Deep Models: Involves techniques like batch normalization and careful weight initialization.
• Dropout: A regularization method where random neurons are ignored during training to prevent overfitting.
• Convolutional Neural Networks (CNNs): Specialized for processing grid-like data such as images.
• Recurrent Neural Networks (RNNs): Designed for sequential data, capturing temporal dependencies.
• Deep Belief Networks (DBNs): Composed of multiple layers of stochastic, latent variables.

Module 6: Deep Learning Research

Quick Revision:

• Recent Trends: Transformers, Diffusion Models, Vision Transformers (ViTs), GNNs.
• Self-Supervised Learning (SSL): Learning representations from unlabeled data.
• Ethics in DL: Bias, fairness, explainability, energy usage.
• Zero-Shot / Few-Shot Learning: Model performs tasks it wasn't explicitly trained on.
• Foundation Models: Large models like GPT, BERT, CLIP, trained on massive data.
• Transfer & Multitask Learning: Sharing knowledge across tasks/domains.
• Model Compression: Pruning, quantization, knowledge distillation.
• Explainability Tools: SHAP, LIME, saliency maps.
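The dropout bullet above can be sketched as "inverted dropout", a common formulation: zero each unit with some probability during training and scale the survivors so the expected activation is unchanged (the rate and activation values here are invented):

```python
import random

def dropout(activations, rate, training=True):
    # Inverted dropout: drop each unit with probability `rate`; scale
    # survivors by 1/(1-rate) so the expected value stays the same.
    if not training or rate == 0.0:
        return activations[:]          # no-op at inference time
    keep = 1.0 - rate
    return [a / keep if random.random() < keep else 0.0
            for a in activations]

random.seed(0)
h = [0.5, 1.2, -0.3, 0.8]
h_train = dropout(h, rate=0.5)                  # some units zeroed, rest doubled
h_eval = dropout(h, rate=0.5, training=False)   # unchanged at inference
```

Because the scaling happens at training time, the network needs no adjustment when dropout is switched off for evaluation.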
MCQs: Module 1 (Total: 20)

1. Which of the following is a supervised learning algorithm?
A) K-means
B) PCA
C) Decision Tree
D) Autoencoder
Answer: C
Explanation: Decision Trees need labeled data.

2. Unsupervised learning is best suited for:
A) Sentiment classification
B) Stock price prediction
C) Dimensionality reduction
D) Email spam detection
Answer: C

3. Reinforcement learning involves:
A) Data with labels
B) Learning by rewards
C) Learning without feedback
D) Pre-labeled clusters
Answer: B

4. Which one is not a fundamental issue in deep learning?
A) Overfitting
B) Data scarcity
C) Low computation cost
D) Interpretability
Answer: C

5. Which of the following is a key feature of deep learning models?
A) Shallow layers
B) Manual feature extraction
C) Hierarchical feature learning
D) Low data dependency
Answer: C

6. Which paradigm fits clustering problems?
A) Supervised
B) Unsupervised
C) Reinforcement
D) Semi-supervised
Answer: B

7. The main challenge in training deep networks is:
A) High bias
B) Underfitting
C) Vanishing gradients
D) Fast convergence
Answer: C

8. Which learning type is used in ChatGPT?
A) Supervised
B) Unsupervised
C) Reinforcement
D) Both A & C
Answer: D
Explanation: Uses RLHF (Reinforcement Learning from Human Feedback) and supervised fine-tuning.

9. Generalization in ML refers to:
A) Fitting the training data exactly
B) Performing well on test/unseen data
C) Overfitting the model
D) Avoiding any learning
Answer: B

10. Which of the following is not a machine learning paradigm?
A) Supervised
B) Controlled
C) Reinforcement
D) Unsupervised
Answer: B

11. A key limitation of traditional ML compared to DL is:
A) Accuracy
B) Human-designed features
C) Data storage
D) Output speed
Answer: B

12. A neuron in a neural network computes:
A) Only output
B) Weighted sum + activation
C) Gradient only
D) Bias only
Answer: B

13. What is the bias in neural networks?
A) Irrelevant data
B) Constant added to weighted input
C) Noise in output
D) Penalty for error
Answer: B

14. Deep learning often suffers from:
A) Data explosion
B) Interpretability issues
C) Low accuracy
D) Constant learning rate
Answer: B

15. Overfitting occurs when:
A) Model is too simple
B) Model is too complex for data
C) Too little training
D) No bias
Answer: B

16. Which of the following is NOT a traditional ML technique?
A) SVM
B) Decision Tree
C) CNN
D) Naive Bayes
Answer: C

17. One major advantage of DL over ML:
A) Smaller models
B) Less training time
C) Feature engineering is automated
D) More memory needed
Answer: C

18. An AI system that plays chess using rewards is using:
A) Supervised Learning
B) Unsupervised Learning
C) Reinforcement Learning
D) Regression
Answer: C

19. "Epoch" in deep learning refers to:
A) Time to build the model
B) Full pass over training data
C) Layer depth
D) Regularization rate
Answer: B

20. Interpretability in DL is low because:
A) It's too accurate
B) It uses simpler models
C) It has many complex layers
D) It's easy to visualize
Answer: C
MCQs: Module 2 (20 Questions)

1. What does a neuron in ANN compute?
A) Gradient
B) Activation function only
C) Weighted sum + activation
D) Data labels
Answer: C

2. The purpose of an activation function is to:
A) Increase training time
B) Normalize input
C) Introduce non-linearity
D) Reduce accuracy
Answer: C

3. Which of these is not an activation function?
A) Sigmoid
B) ReLU
C) Softmax
D) SVM
Answer: D

4. Which activation function is used for binary classification?
A) ReLU
B) Sigmoid
C) Softmax
D) Tanh
Answer: B

5. ReLU returns:
A) Only 1 or -1
B) Linear values
C) Max(0, x)
D) Negative values only
Answer: C

6. Cardinality in fuzzy sets refers to:
A) Degree of fuzziness
B) Total elements
C) Number of neurons
D) Shape of fuzzy graph
Answer: B

7. A perceptron is:
A) Linear classifier
B) Deep network
C) Optimization algorithm
D) Error correction mechanism
Answer: A

8. A multilayer perceptron is called deep if:
A) It has over 100 neurons
B) It uses sigmoid
C) It has more than 1 hidden layer
D) It uses unsupervised learning
Answer: C

9. In a neural network, weights are:
A) Static values
B) Updated during training
C) Random noise
D) Biases
Answer: B

10. Tanh activation function ranges from:
A) 0 to 1
B) -1 to 1
C) -∞ to ∞
D) 0 to ∞
Answer: B

11. A Softmax function is used in:
A) Regression
B) Binary classification
C) Multi-class classification
D) Clustering
Answer: C

12. Which is NOT a property of fuzzy relations?
A) Reflexivity
B) Transitivity
C) Symmetry
D) Supervised learning
Answer: D

13. What is the primary challenge with sigmoid?
A) Always 1
B) Not differentiable
C) Vanishing gradient
D) Exploding output
Answer: C

14. Multi-layer neural networks solve which major problem of perceptrons?
A) Random output
B) Linearly inseparable data
C) Speed
D) Cost
Answer: B

15. Leaky ReLU helps with:
A) Bias issues
B) Overfitting
C) Dying ReLU problem
D) Vanishing gradient
Answer: C

16. Output of sigmoid function is:
A) Discrete
B) Probabilistic
C) Deterministic
D) Undefined
Answer: B

17. In fuzzy logic, membership value lies between:
A) -1 and 1
B) 0 and 10
C) 0 and 1
D) 1 and 100
Answer: C

18. The number of output neurons in classification = ?
A) 1
B) Number of classes
C) Number of samples
D) Infinite
Answer: B

19. ANN is inspired by:
A) Animal fur
B) Electrical circuits
C) Human brain
D) DNA
Answer: C

20. The final layer of a binary classifier typically uses:
A) ReLU
B) Softmax
C) Sigmoid
D) Linear
Answer: C
MCQs: Module 3 (20 Questions)

1. Which is a loss function used for regression?
A) Cross-entropy
B) Hinge loss
C) Mean Squared Error
D) Negative log-likelihood
Answer: C

2. Cross-entropy loss is used in:
A) Clustering
B) Regression
C) Binary/Multi-class classification
D) Reinforcement learning
Answer: C

3. Backpropagation uses:
A) Newton's law
B) Chain rule
C) Euler's method
D) L'Hospital's Rule
Answer: B

4. Regularization is used to:
A) Increase training error
B) Overfit model
C) Reduce overfitting
D) Improve underfitting
Answer: C

5. L1 regularization promotes:
A) Complex models
B) Larger weights
C) Sparsity
D) Overfitting
Answer: C

6. L2 regularization is also known as:
A) Ridge
B) Lasso
C) Dropout
D) Bias
Answer: A

7. Dropout helps by:
A) Increasing loss
B) Reducing variance
C) Increasing gradient
D) Stopping backprop
Answer: B

8. Gradient descent works by:
A) Increasing error
B) Maximizing loss
C) Minimizing loss
D) Random updates
Answer: C

9. Adam optimizer uses:
A) Momentum + RMS
B) Only gradient
C) Bias correction only
D) L2 norm only
Answer: A

10. Model selection involves:
A) Data labeling
B) Choosing optimizers
C) Tuning architecture and hyperparameters
D) Data cleaning
Answer: C

11. Backpropagation adjusts:
A) Output
B) Weights and biases
C) Epochs
D) Inputs
Answer: B

12. Risk minimization refers to:
A) Increasing training size
B) Regularization
C) Reducing expected loss
D) More epochs
Answer: C

13. Training loss is:
A) Always zero
B) Only for test data
C) Computed on training data
D) Ignored
Answer: C

14. Overfitting happens when:
A) Model is simple
B) Model memorizes training data
C) Model ignores training data
D) Gradient is large
Answer: B

15. Epoch refers to:
A) One iteration
B) One complete training pass
C) Only one batch
D) Loss value
Answer: B

16. Validation loss helps to detect:
A) Training convergence
B) Overfitting
C) Input noise
D) Batch size
Answer: B

17. Hyperparameters include:
A) Bias
B) Weights
C) Learning rate, batch size
D) Activation outputs
Answer: C

18. Learning rate too high causes:
A) Fast convergence
B) Better accuracy
C) Overshooting minimum
D) Underfitting
Answer: C

19. Which optimizer adapts learning rate per parameter?
A) SGD
B) RMSProp
C) Gradient Boost
D) Naive Bayes
Answer: B

20. Backpropagation stops at:
A) Output layer
B) Input layer
C) Mid layer
D) Randomly
Answer: B
MCQs: Module 4 (15 Questions)

1. CRF is a:
A) Generative model
B) Discriminative model
C) Reinforcement model
D) Unsupervised model
Answer: B

2. Linear chain CRF is suitable for:
A) Tabular data
B) Images
C) Sequence labeling
D) Clustering
Answer: C

3. Partition function in CRF is used to:
A) Update weights
B) Normalize probability
C) Calculate entropy
D) Define classes
Answer: B

4. Difference between HMM and CRF:
A) CRF models joint probability
B) HMM uses hidden states
C) HMM is discriminative
D) CRF is generative
Answer: B

5. Which is an inference technique in CRFs?
A) Dropout
B) Forward pass
C) Belief propagation
D) Stochastic sampling
Answer: C

6. Entropy is defined as:
A) Measure of certainty
B) Data redundancy
C) Uncertainty in distribution
D) Gradient
Answer: C

7. CRFs are best used in:
A) Image classification
B) Reinforcement learning
C) Named Entity Recognition
D) Clustering
Answer: C

8. Training CRFs involves maximizing:
A) Joint likelihood
B) Posterior probability
C) Entropy
D) Partition loss
Answer: B

9. Hidden states in HMM are:
A) Directly observable
B) Learned through gradient descent
C) Latent
D) Always constant
Answer: C

10. Which models rely on partition functions?
A) Linear regression
B) CRFs
C) KNN
D) Decision Trees
Answer: B

11. Which of the following is a key limitation of HMMs?
A) Scalability
B) Inability to handle overlapping features
C) Fast inference
D) Non-sequential learning
Answer: B

12. What helps compute marginals in CRFs?
A) Backpropagation
B) Message passing
C) Pooling
D) Loss calculation
Answer: B

13. What improves CRF performance?
A) Fewer parameters
B) More hidden states
C) Richer features
D) Larger loss
Answer: C

14. Markov network is:
A) Directed
B) Undirected graphical model
C) Tree-based model
D) Decision rule
Answer: B

15. Which is better for structured output problems?
A) CRF
B) SVM
C) Naive Bayes
D) Perceptron
Answer: A

MCQs: Module 5 (20 Questions)

1. What distinguishes a deep feedforward network from a shallow one?
A) Use of convolutional layers
B) Presence of recurrent connections
C) Multiple hidden layers
D) Use of dropout
Answer: C

2. Which regularization technique involves adding a penalty equal to the absolute value of the magnitude of coefficients?
A) L1 Regularization
B) L2 Regularization
C) Dropout
D) Batch Normalization
Answer: A

3. Dropout helps prevent overfitting by:
A) Increasing the learning rate
B) Reducing the number of layers
C) Randomly setting a fraction of input units to zero during training
D) Adding more neurons
Answer: C

4. Which layer is primarily responsible for feature extraction in CNNs?
A) Fully connected layer
B) Convolutional layer
C) Pooling layer
D) ReLU layer
Answer: B

5. RNNs are particularly suited for:
A) Image classification
B) Sequential data processing
C) Tabular data analysis
D) Clustering tasks
Answer: B

6. What is the primary function of pooling layers in CNNs?
A) To increase the size of the feature maps
B) To reduce the spatial dimensions of the feature maps
C) To apply activation functions
D) To normalize the data
Answer: B
7. Which activation function is commonly used in deep neural networks due to its simplicity and effectiveness?
A) Sigmoid
B) Tanh
C) ReLU
D) Softmax
Answer: C

8. In the context of deep learning, what does 'vanishing gradient' refer to?
A) Gradients that become too large
B) Gradients that become too small, hindering learning
C) Loss of data during training
D) Overfitting of the model
Answer: B

9. Deep Belief Networks are composed of multiple layers of:
A) Convolutional layers
B) Recurrent layers
C) Restricted Boltzmann Machines
D) Decision trees
Answer: C

10. Which technique is used to prevent exploding gradients in deep networks?
A) Dropout
B) Gradient Clipping
C) Batch Normalization
D) Weight Decay
Answer: B

11. Batch normalization helps in:
A) Reducing internal covariate shift
B) Increasing overfitting
C) Decreasing model complexity
D) Eliminating the need for activation functions
Answer: A

12. The main advantage of using CNNs in image processing is:
A) Their ability to handle sequential data
B) Parameter sharing and spatial invariance
C) Requirement of less data
D) Simpler architecture
Answer: B

13. Which of the following is a common issue when training deep neural networks?
A) Overfitting
B) Underfitting
C) High bias
D) All of the above
Answer: D

14. What is the role of the Softmax function in neural networks?
A) To introduce non-linearity
B) To normalize outputs into probability distributions
C) To reduce dimensionality
D) To prevent overfitting
Answer: B

15. Which technique involves training a model on one task and then fine-tuning it on another related task?
A) Transfer Learning
B) Regularization
C) Data Augmentation
D) Ensemble Learning
Answer: A

16. In RNNs, the problem of long-term dependencies is addressed by:
A) Using more layers
B) Applying dropout
C) Implementing LSTM or GRU units
D) Reducing the learning rate
Answer: C

17. Which of the following is NOT a characteristic of Deep Belief Networks?
A) Unsupervised pre-training
B) Stacked Restricted Boltzmann Machines
C) Feedforward architecture
D) Use of convolutional layers
Answer: D

18. The primary purpose of using activation functions in neural networks is to:
A) Introduce non-linearity
B) Reduce computation time
C) Normalize the output
D) Increase the number of parameters
Answer: A
19. Which of the following is a benefit of using dropout during training?
A) Faster convergence
B) Reduced training time
C) Improved generalization
D) Increased model complexity
Answer: C

20. In CNNs, the function of the ReLU activation is to:
A) Apply dropout
B) Normalize data
C) Remove negative values by outputting zero
D) Reduce overfitting
Answer: C

MCQs: Module 6 (30 Questions)

1. What is the core idea behind self-supervised learning?
A) Use labeled data only
B) Learn from noisy data
C) Generate supervisory signals from the data itself
D) Use reinforcement learning rewards
Answer: C

2. Which architecture is the backbone of models like BERT and GPT?
A) CNN
B) RNN
C) Transformer
D) DBN
Answer: C

3. A Vision Transformer (ViT) differs from CNNs in that it:
A) Uses convolution for patches
B) Ignores positional encoding
C) Uses self-attention to process image patches
D) Cannot be used for classification
Answer: C

4. What is Zero-Shot Learning?
A) Learning without training data
B) Predicting classes never seen during training
C) Training on zero epochs
D) Training with infinite data
Answer: B

5. CLIP from OpenAI learns a joint representation of:
A) Audio and video
B) Image and text
C) Text and speech
D) Image and audio
Answer: B

6. Which method reduces model size without significant accuracy loss?
A) Fine-tuning
B) Pruning
C) BatchNorm
D) Data Augmentation
Answer: B

7. What does quantization do?
A) Converts models to float64
B) Increases number of layers
C) Reduces model precision to save space
D) Adds noise to training data
Answer: C

8. In knowledge distillation, the smaller model is called the:
A) Student
B) Teacher
C) Assistant
D) Expert
Answer: A

9. Which task benefits most from GNNs (Graph Neural Networks)?
A) Image recognition
B) Sequential tagging
C) Node classification
D) Language translation
Answer: C

10. Which of these is a tool for model explainability?
A) Adam Optimizer
B) SHAP
C) RMSProp
D) KNN
Answer: B
11. Transformers rely on which mechanism to process sequences?
A) Convolution
B) Recurrent cells
C) Attention
D) Pooling
Answer: C

12. What is the main challenge of training large deep models?
A) Too much speed
B) Overgeneralization
C) Compute and memory constraints
D) Lack of activation functions
Answer: C

13. Which of the following is a foundation model?
A) SVM
B) CNN
C) GPT
D) LDA
Answer: C

14. In few-shot learning, models:
A) Train without supervision
B) Learn from a large number of examples
C) Adapt to new tasks with few examples
D) Only predict classes seen in training
Answer: C

15. Model distillation is used to:
A) Enlarge models
B) Increase training data
C) Transfer knowledge to a smaller model
D) Remove layers
Answer: C

16. The BERT model was trained using:
A) Masked Language Modeling
B) Sequence-to-sequence learning
C) One-shot learning
D) GNN-based training
Answer: A

17. What does the "transformer" eliminate compared to RNNs?
A) Fully connected layers
B) Need for sequential processing
C) Backpropagation
D) Attention mechanism
Answer: B

18. Which loss function is commonly used in contrastive learning?
A) Cross-entropy
B) Hinge loss
C) Triplet loss
D) Contrastive loss
Answer: D

19. In explainable AI, a saliency map is used to:
A) Visualize neuron weights
B) Highlight important input regions
C) Add noise
D) Reduce training time
Answer: B

20. What is the role of positional encoding in Transformers?
A) Normalize input
B) Capture spatial info
C) Retain word order
D) Enhance speed
Answer: C

21. Which company introduced the ViT model?
A) Meta
B) OpenAI
C) DeepMind
D) Google
Answer: D

22. What is a major ethical concern in deep learning research?
A) Overfitting
B) Floating point precision
C) Model fairness and bias
D) High dropout rates
Answer: C
23. GPT models are trained using:
A) Masked tokens
B) Next-token prediction
C) Multimodal inputs
D) Vision transformers
Answer: B

24. Few-shot learning typically uses which algorithmic technique?
A) Meta-learning
B) SGD
C) MLP
D) DBNs
Answer: A

25. Which is a challenge in deploying large models like GPT?
A) Too few parameters
B) Model compression
C) Latency and hardware requirements
D) Inflexible architecture
Answer: C

26. What is "catastrophic forgetting" in multitask learning?
A) Losing training data
B) Forgetting old tasks while learning new ones
C) Overfitting
D) Saturated neurons
Answer: B

27. A major advantage of transformer-based models is:
A) Low compute cost
B) Training on small datasets
C) Parallelization
D) Recurrence
Answer: C

28. In diffusion models (like Stable Diffusion), output is generated by:
A) Upscaling inputs
B) Reversing noise process
C) Using RNNs
D) Hash encoding
Answer: B

29. SHAP values explain:
A) Gradients
B) Feature importance for predictions
C) Model weights
D) Accuracy
Answer: B

30. Which of these is a method to improve inference efficiency in large models?
A) Ensemble models
B) Pruning
C) Increasing hidden layers
D) Decreasing batch size
Answer: B

Common questions

Message passing in CRFs is crucial for computing marginals, as it allows the model to efficiently propagate information through the nodes, facilitating marginal probability computations and enabling efficient inference. This process is integral for tasks like sequence labeling, ensuring accurate probability distributions are calculated for sequences based on input data.
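The message passing described above can be illustrated with the forward recursion for a linear-chain model: the partition function Z is computed by passing summed scores left to right instead of enumerating every label sequence (a toy sketch; all scores below are invented):

```python
import math
from itertools import product

# Toy linear-chain scores: 2 labels over 3 positions.
# emit[t][y]: score of label y at position t; trans[a][b]: transition score.
emit = [[1.0, 0.5], [0.2, 1.5], [0.8, 0.3]]
trans = [[0.6, 0.4], [0.1, 0.9]]

def z_brute_force():
    # Enumerate all 2**3 label sequences and sum their exponentiated scores.
    total = 0.0
    for seq in product(range(2), repeat=3):
        s = sum(emit[t][y] for t, y in enumerate(seq))
        s += sum(trans[a][b] for a, b in zip(seq, seq[1:]))
        total += math.exp(s)
    return total

def z_forward():
    # Forward message passing: alpha[y] accumulates all paths ending in
    # label y, so Z is computed in time linear in the sequence length.
    alpha = [math.exp(e) for e in emit[0]]
    for t in range(1, len(emit)):
        alpha = [sum(alpha[a] * math.exp(trans[a][b] + emit[t][b])
                     for a in range(2)) for b in range(2)]
    return sum(alpha)
```

Both routines return the same Z; belief propagation generalizes this recursion beyond chains to tree-structured graphs.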

Deep Belief Networks (DBNs) typically use unsupervised pre-training, which involves training the network layer by layer using Restricted Boltzmann Machines (RBMs) to learn the underlying structure of the input data without labels. This pre-training helps in improving convergence during the subsequent supervised training phase, thereby minimizing overfitting and enhancing the learning of complex patterns in large networks .

The attention mechanism in transformers allows the model to weigh the influence of different parts of the input sequence dynamically, providing direct, contextually informed connections between distant sequence elements. Unlike RNNs, which process sequences sequentially and often struggle with long-term dependencies, transformers handle sequences in parallel, facilitating efficient processing and capturing complex dependencies in the data.
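The dynamic weighting described above can be sketched as scaled dot-product attention over a tiny sequence (a pure-Python toy; the query, key, and value vectors are invented):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    # Each output is a softmax(q·k / sqrt(d))-weighted average of the values,
    # so every position can attend directly to every other position.
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[10.0, 0.0], [0.0, 10.0]]
ctx = attention(q, k, v)   # the query attends mostly to the first key/value
```

All positions are processed with the same few matrix products, which is why attention parallelizes where recurrence cannot.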

Regularization helps prevent overfitting by adding a penalty to the loss function, which discourages complex models. L1 regularization (Lasso) promotes sparsity by adding the absolute value of the weights, leading to some weights being zeroed out, while L2 regularization (Ridge) adds the square of the weights, which tends to distribute weights more evenly and prevents large weights.
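The two penalties can be written down directly; a minimal sketch of adding L1 and L2 terms to a loss, with an invented weight vector, data loss, and strength lambda:

```python
def l1_penalty(weights, lam):
    # Lasso: lambda * sum(|w|) -- pushes small weights exactly to zero
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    # Ridge: lambda * sum(w^2) -- shrinks all weights smoothly
    return lam * sum(w * w for w in weights)

w = [0.5, -2.0, 0.0, 1.5]
data_loss = 1.0                                 # stand-in for MSE/cross-entropy
loss_l1 = data_loss + l1_penalty(w, lam=0.1)    # 1.0 + 0.1 * 4.0  = 1.4
loss_l2 = data_loss + l2_penalty(w, lam=0.1)    # 1.0 + 0.1 * 6.5  = 1.65
```

Because |w| penalizes small and large weights at the same rate while w² barely penalizes small ones, L1 drives weights to exactly zero where L2 merely shrinks them.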

Knowledge distillation is a process where a smaller model (the student) is trained to replicate the output of a larger model (the teacher). The benefits include reduced model size and improved inference speed, without significantly sacrificing accuracy. It achieves this by transferring the knowledge learned by the complex teacher model, thus retaining its predictive power in a more compact form.
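A common mechanism for this transfer is training the student against temperature-softened teacher outputs; the sketch below (with invented logits and temperature) shows how a higher temperature exposes the teacher's relative class preferences:

```python
import math

def softmax(logits, T=1.0):
    # Higher temperature T softens the distribution
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

teacher_logits = [8.0, 2.0, 1.0]
hard = softmax(teacher_logits, T=1.0)   # nearly one-hot: little to imitate
soft = softmax(teacher_logits, T=4.0)   # reveals relative class similarities

def cross_entropy(p, q):
    # The student is trained to match the teacher's soft targets
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

student_probs = softmax([5.0, 2.5, 1.0], T=4.0)
distill_loss = cross_entropy(soft, student_probs)
```

Minimizing this loss pulls the student's softened distribution toward the teacher's, which carries more information than the hard labels alone.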

The primary function of pooling layers in CNNs is to reduce the spatial dimensions of feature maps, which helps in controlling overfitting, reducing computation and memory costs, and making the network invariant to minor changes in the position of features in the input image. This is crucial for maintaining relevant spatial hierarchies in image processing tasks.
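The dimension reduction above is easy to see in code: a 2x2 max-pooling pass over a small invented feature map halves each spatial dimension while keeping the strongest activation in each window:

```python
def max_pool_2x2(fmap):
    # 2x2 max pooling with stride 2: each output cell is the maximum of a
    # non-overlapping 2x2 window, so spatial dimensions are halved.
    rows, cols = len(fmap), len(fmap[0])
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, cols, 2)]
            for i in range(0, rows, 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 0, 5, 6],
        [1, 2, 7, 8]]
pooled = max_pool_2x2(fmap)   # 4x4 -> 2x2: [[4, 2], [2, 8]]
```

Shifting a feature by one pixel inside its window leaves the pooled output unchanged, which is the source of the small translation invariance mentioned above.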

Graph Neural Networks (GNNs) are specifically designed to operate on graph-structured data, which allows them to capture the dependencies between nodes directly. This ability to leverage the inherent graph structure is what sets them apart from traditional neural networks, making GNNs particularly effective for node classification tasks, where relationships between nodes contribute significantly to the task.

Transfer learning enhances training efficiency by utilizing the pre-trained knowledge from a related task, thereby requiring less data and fewer computational resources to achieve high performance on the new task. This is especially beneficial in scenarios with limited data, as the model can leverage generalized features learned previously to adapt rapidly to new but related tasks.

The primary challenge associated with the sigmoid activation function is its susceptibility to the vanishing gradient problem: its gradients become very small during backpropagation, hindering learning in deep networks. This is typically addressed by using alternative activation functions such as ReLU (Rectified Linear Unit), whose gradient does not saturate for positive inputs and so largely avoids the problem.
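The effect is visible with a few lines of arithmetic: the sigmoid's derivative peaks at 0.25, so chaining it through ten layers (even in the best case) shrinks the gradient by 0.25¹⁰ (a toy illustration, not a full backpropagation):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # maximum value 0.25, at x = 0

# Chain rule through 10 sigmoid layers, all at the best case x = 0:
factor = 1.0
for _ in range(10):
    factor *= sigmoid_grad(0.0)   # multiply in one layer's local gradient

# factor == 0.25**10 ≈ 9.5e-7: the gradient has effectively vanished.

def relu_grad(x):
    return 1.0 if x > 0 else 0.0  # ReLU passes the gradient unchanged for x > 0
```

Away from x = 0 the sigmoid's gradient is even smaller, so real networks fare worse than this best-case product suggests.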

Convolutional layers in CNNs are parameter-efficient because they share weights across spatial locations, significantly reducing the total number of parameters compared to fully connected layers. This makes them highly effective at capturing spatial hierarchies in the input data, such as identifying patterns and textures. Fully connected layers, by contrast, have one weight per input-output pair, which often renders them far less efficient for spatial data.
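The parameter savings above can be checked with simple counting; the layer sizes in this sketch (a 32x32 RGB input and 64 output channels) are chosen purely for illustration:

```python
def conv_params(in_ch, out_ch, k):
    # A conv layer reuses the same k x k kernel at every spatial location:
    # parameters = out_ch * (in_ch * k * k + 1 bias per output channel).
    return out_ch * (in_ch * k * k + 1)

def fc_params(in_features, out_features):
    # A fully connected layer has one weight per input-output pair, plus biases.
    return out_features * (in_features + 1)

# 32x32 RGB input; 64 output channels vs. 64*32*32 fully connected units:
conv = conv_params(3, 64, k=3)             # 64 * (27 + 1) = 1,792 parameters
fc = fc_params(3 * 32 * 32, 64 * 32 * 32)  # 65,536 * 3,073 ≈ 201 million
```

The gap of five orders of magnitude is exactly the weight sharing the paragraph describes: the convolution pays for one small kernel per channel pair, not one weight per pixel pair.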
