Deep Learning CIE-2

1(a) Underfitting, Overfitting, Bias, and Variance:


• Underfitting occurs when a model is too simple to capture the underlying patterns in the data, leading to poor performance on both the training and testing datasets.
• Overfitting happens when a model is too complex and captures noise in the training data, which reduces its ability to generalize to new data.
• Bias is the error introduced by the simplifying assumptions made by the model. High bias leads to underfitting.
• Variance is the error due to sensitivity to small fluctuations in the training set. High variance leads to overfitting.
A balance between bias and variance is crucial for a model's generalization (see the decomposition below).
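
For intuition, the expected prediction error can be split into bias, variance, and irreducible noise. This standard decomposition (not stated in the notes; written here in common notation) is:

```latex
% Expected test error at a point x, for a learned predictor \hat{f}(x) of a true function f(x) with noise variance \sigma^2
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible error}}
```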

1(b) Preventing Overfitting in Deep Neural Nets using Early Stopping and Dropout:

• Early stopping monitors validation performance during training and halts training once the performance stops improving, avoiding overfitting.
• Dropout is a regularization technique where randomly selected neurons are ignored during training, reducing dependency on specific neurons and improving generalization.
These methods ensure the model does not memorize the training data but rather learns patterns.
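
A minimal sketch of both techniques in Keras (the layer sizes, dropout rate, patience, and the data arrays `x_train`/`y_train` are illustrative assumptions, not taken from the notes):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Small fully connected classifier with dropout for regularization
model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),          # randomly ignore 50% of this layer's units each training step
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Early stopping: halt once validation loss stops improving for 5 consecutive epochs
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                           restore_best_weights=True)

# x_train, y_train are assumed to be preprocessed arrays (e.g. flattened image data)
# model.fit(x_train, y_train, validation_split=0.2, epochs=100, callbacks=[early_stop])
```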

1(c) TensorFlow, Keras, and TensorFlow Operations:


• TensorFlow is a powerful open-source library for numerical computation and machine learning, enabling the creation of computational graphs.
• Keras is a high-level API within TensorFlow designed for building and training neural networks easily.
• TensorFlow operations include tensor manipulations, linear algebra, and training functions for deep learning, facilitating efficient computation on CPUs and GPUs.
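
A few representative TensorFlow operations (a minimal sketch; the particular tensors and values are illustrative):

```python
import tensorflow as tf

# Tensor creation and manipulation
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.ones((2, 2))

# Linear algebra and element-wise operations
c = tf.matmul(a, b)          # matrix product
d = a + b                    # element-wise addition
e = tf.reduce_mean(a)        # reduction to a scalar

# Automatic differentiation, the core mechanism behind training deep networks
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2
grad = tape.gradient(y, x)   # dy/dx = 2x = 6.0

print(c.numpy(), e.numpy(), grad.numpy())
```
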
1(d) Why Vanilla Neural Networks Do Not Scale?

• Vanilla (fully connected) neural networks have limitations in handling high-dimensional data and require a large number of parameters, making them computationally expensive.
• They lack spatial hierarchies, which are crucial for image and sequence data, leading to poor performance on complex tasks.
• Scaling vanilla networks increases training time and memory requirements, making them impractical for large-scale applications.
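
To illustrate the parameter blow-up, a quick back-of-the-envelope count for a single fully connected layer on a modest image (the image size and layer width are illustrative assumptions):

```python
# A 224x224 RGB image flattened into a vector, fed into one dense layer of 1000 units
inputs = 224 * 224 * 3             # 150,528 input features
hidden = 1000                      # width of the first hidden layer

weights = inputs * hidden          # one weight per (input, unit) pair
biases = hidden
total = weights + biases

print(f"{total:,} parameters")     # 150,529,000 parameters for a single layer
```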

1(e) Filters, Strides, Padding, and Pooling:

• Filters are kernels that extract features from input data through convolution operations.
• Strides determine the step size of the filter as it moves across the input data.
• Padding adds extra border pixels to the input to control the spatial size of the output features.
• Max pooling extracts the maximum value from each region of a feature map, reducing dimensionality.
• Average pooling computes the average of the values in a region, emphasizing overall trends rather than extremes.
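
These building blocks map directly onto Keras layers; in the sketch below the filter counts, kernel sizes, and 28x28 grayscale input are illustrative assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    # 32 filters of size 3x3, stride 1; 'same' padding keeps the 28x28 spatial size
    layers.Conv2D(32, kernel_size=3, strides=1, padding="same", activation="relu"),
    # Max pooling halves each spatial dimension: 28x28 -> 14x14
    layers.MaxPooling2D(pool_size=2),
    # 64 filters with stride 2 and 'valid' padding shrink the feature map further
    layers.Conv2D(64, kernel_size=3, strides=2, padding="valid", activation="relu"),
    # Average pooling summarizes each region by its mean value
    layers.AveragePooling2D(pool_size=2),
])
model.summary()
```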

1(f) Applications of Large Neural Networks:


• Large neural networks are used in natural language processing (NLP) for tasks like language translation and sentiment analysis.
• They power image recognition systems in medical imaging and self-driving cars.
• In speech processing, they enable real-time speech-to-text conversion.
• They are pivotal in game-playing AI, such as AlphaGo.
• These networks are also applied in recommendation systems for e-commerce and streaming services.
Long Answer Questions:

2. Training of Unsupervised Pretrained Networks (UPN):


• Unsupervised Pretrained Networks (UPNs) leverage unsupervised learning to train a model on unlabeled data before fine-tuning it for supervised tasks.
• In the first phase, UPNs learn a representation of the input data without using any labels. Common methods include autoencoders and restricted Boltzmann machines (RBMs).
• The network's weights are initialized by training layer by layer, a process called greedy layer-wise pretraining. Each layer uses the output of the previous layer as its input (see the sketch below).
• Once pretraining is complete, the entire network is fine-tuned using labeled data and supervised learning to improve performance on the target task.
• This approach combats issues like poor weight initialization and overfitting, especially in scenarios with limited labeled data.
• UPNs are effective in dimensionality reduction, anomaly detection, and feature extraction.
• Examples include Deep Belief Networks (DBNs) and stacked autoencoders. These architectures achieve better generalization and training efficiency.
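
A minimal sketch of greedy layer-wise pretraining with stacked autoencoders in Keras (the layer sizes, two-layer depth, and the stand-in data array `x_unlabeled` are illustrative assumptions):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Unlabeled data stand-in: 1000 samples of 784 features (e.g. flattened images)
x_unlabeled = np.random.rand(1000, 784).astype("float32")

def pretrain_layer(x, units):
    """Train a one-hidden-layer autoencoder on x and return its trained encoder layer."""
    inp = keras.Input(shape=(x.shape[1],))
    encoder = layers.Dense(units, activation="relu")
    decoded = layers.Dense(x.shape[1], activation="sigmoid")(encoder(inp))
    autoencoder = keras.Model(inp, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(x, x, epochs=5, batch_size=64, verbose=0)
    return encoder

# Greedy layer-wise pretraining: each layer is trained on the previous layer's output
enc1 = pretrain_layer(x_unlabeled, 256)
h1 = enc1(x_unlabeled).numpy()
enc2 = pretrain_layer(h1, 64)

# Stack the pretrained encoders and add a supervised head for fine-tuning
model = keras.Sequential([keras.Input(shape=(784,)), enc1, enc2,
                          layers.Dense(10, activation="softmax")])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(x_labeled, y_labeled, ...)   # fine-tune on labeled data
```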

3. Recursive Neural Networks (RecNNs):


• Recursive Neural Networks (RecNNs) are structured models designed to operate on hierarchical input, such as trees.
• Each node in the tree is processed recursively, with its output determined by combining information from its child nodes.
• They are commonly used in applications like natural language processing (NLP), where input data such as sentences can be represented as parse trees.
• A tree-structured RecNN can compute a vector representation for a sentence by processing words and combining them using learned weight matrices.
• RecNNs use shared weights, reducing the number of parameters and enabling the model to generalize across different tree structures.
• Applications include sentiment analysis, syntax parsing, and semantic analysis.
• Challenges in training RecNNs include handling variable tree structures and avoiding vanishing gradients in deep hierarchies.
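
A minimal sketch of the recursive composition step using NumPy (the binary-tree example, vector dimensionality, and random word vectors are illustrative assumptions):

```python
import numpy as np

d = 8                                        # dimensionality of node vectors
rng = np.random.default_rng(0)
W = rng.standard_normal((d, 2 * d)) * 0.1    # shared composition weights, reused at every node
b = np.zeros(d)

def compose(left, right):
    """Combine two child vectors into a parent vector using the shared weights."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Toy parse tree for "very good movie": ((very good) movie)
word_vec = {w: rng.standard_normal(d) for w in ["very", "good", "movie"]}
phrase = compose(word_vec["very"], word_vec["good"])
sentence = compose(phrase, word_vec["movie"])

print(sentence.shape)   # (8,) -- a fixed-size vector for the whole sentence
```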

4. Convolutional Neural Networks (CNNs):


• Convolutional Neural Networks (CNNs) are specialized neural networks designed for processing structured grid data such as images.
• CNNs use convolutional layers, where filters slide over the input to extract features like edges, textures, and shapes.
• They employ pooling layers, such as max pooling and average pooling, to reduce the spatial dimensions of feature maps, making computation efficient.
• A fully connected layer at the end maps the extracted features to class probabilities in tasks like classification.
• Techniques like padding ensure that the spatial dimensions of the output remain consistent after convolution operations.
• CNNs are widely used in image recognition, object detection, and video processing.
• Advanced architectures like ResNet, AlexNet, and VGGNet have demonstrated state-of-the-art performance in computer vision.
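
A small end-to-end CNN classifier in Keras (the architecture and the 10-class, 32x32 RGB input are illustrative assumptions, roughly CIFAR-10-sized):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, 3, padding="same", activation="relu"),    # convolutional feature extraction
    layers.MaxPooling2D(2),                                     # 32x32 -> 16x16
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(2),                                     # 16x16 -> 8x8
    layers.Flatten(),
    layers.Dense(128, activation="relu"),                       # fully connected head
    layers.Dense(10, activation="softmax"),                     # class probabilities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```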

5. Recurrent Neural Networks (RNNs):


• Recurrent Neural Networks (RNNs) are designed to handle sequential data by maintaining a memory of previous inputs through hidden states.
• At each time step, an RNN processes the current input and combines it with the previous hidden state to update the hidden state.
• RNNs are particularly effective in time-series prediction, speech recognition, and natural language processing tasks.
• However, standard RNNs suffer from vanishing and exploding gradient problems, limiting their ability to model long-term dependencies.
• Variants like LSTMs (Long Short-Term Memory networks) and GRUs (Gated Recurrent Units) address these issues by introducing gating mechanisms to control information flow.
• Training RNNs requires techniques like backpropagation through time (BPTT), which unfolds the network across time steps to calculate gradients.
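
The per-time-step update described above can be written compactly (standard vanilla-RNN notation; the symbols are not defined in the notes):

```latex
% Hidden state update and output at time step t
h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h), \qquad
y_t = W_{hy} h_t + b_y
```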

6. Write short notes on:

(a) Autoencoders:
• Autoencoders are unsupervised models that learn a compressed representation (encoding) of input data.
• They consist of an encoder, which compresses the input, and a decoder, which reconstructs it.
• Applications include dimensionality reduction, denoising, and anomaly detection.
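
A minimal encoder/decoder pair in Keras (the 784-dimensional input and 32-dimensional bottleneck are illustrative assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

inp = keras.Input(shape=(784,))
code = layers.Dense(32, activation="relu")(inp)         # encoder: compress to 32 dimensions
out = layers.Dense(784, activation="sigmoid")(code)     # decoder: reconstruct the input

autoencoder = keras.Model(inp, out)
encoder = keras.Model(inp, code)                        # reusable compressed representation
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(x, x, epochs=20)   # trained to reproduce its own input
```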

(b) GAN (Generative Adversarial Networks):


• GANs consist of two networks: a generator that creates data and a discriminator that distinguishes real data from generated data.
• These models are widely used in image synthesis, data augmentation, and creating realistic simulations.
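
A minimal sketch of the two competing networks in Keras (the latent dimension, layer sizes, and 28x28 image shape are illustrative assumptions; the alternating training loop is omitted):

```python
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 100

# Generator: maps random noise to a fake 28x28 image
generator = keras.Sequential([
    keras.Input(shape=(latent_dim,)),
    layers.Dense(256, activation="relu"),
    layers.Dense(28 * 28, activation="sigmoid"),
    layers.Reshape((28, 28)),
])

# Discriminator: classifies an image as real (1) or generated (0)
discriminator = keras.Sequential([
    keras.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Training alternates: the discriminator learns to tell real from fake,
# while the generator learns to produce samples the discriminator accepts as real.
```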

(c) LSTM (Long Short-Term Memory):


• LSTMs are a type of RNN designed to capture long-term dependencies in sequences.
• They use gates (input, forget, and output) to control the flow of information, addressing vanishing gradient issues.
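
The three gates can be summarized with the standard LSTM update equations (standard notation, not given in the notes; σ is the sigmoid function and ⊙ denotes element-wise multiplication):

```latex
f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)          % forget gate
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)          % input gate
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)          % output gate
\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)   % candidate cell state
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
h_t = o_t \odot \tanh(c_t)
```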

(d) GRU (Gated Recurrent Units):


• GRUs are a simplified variant of LSTMs with fewer gates (update and reset), making them computationally efficient.
• They are effective in modeling sequential data and exhibit performance comparable to LSTMs.
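
A quick Keras comparison of the two gated layers (the sequence length, feature count, and 64-unit layer width are illustrative assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

def count_params(cell):
    # Sequences of length 20 with 32 features per step; the recurrent layer has 64 units
    m = keras.Sequential([keras.Input(shape=(20, 32)), cell])
    return m.count_params()

print("LSTM:", count_params(layers.LSTM(64)))   # four gate/candidate transforms
print("GRU: ", count_params(layers.GRU(64)))    # only update/reset gates plus a candidate -> fewer parameters
```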
