Deep Learning Course File
COURSE FILE
B. TECH: 7TH SEMESTER
Table of Contents
Sr. No.  Particulars                                                                                                    Page No.
8        Quiz question paper(s) and result (minimum 3 quizzes are to be conducted)                                      10-13
9        Tutorial sheets                                                                                                NA
10       Identification of Academically Weak Students and action taken for improvement                                  14
11       Identification of Fast Learners and efforts made to help fast learners to achieve university merit positions   14
Detailed Contents:
UNIT 1: Machine Learning Basics: Learning, Under-fitting, Overfitting, Estimators, Bias,
Variance, Maximum Likelihood Estimation, Bayesian Statistics, Supervised Learning,
Unsupervised Learning and Stochastic Gradient Descent. [4 hrs] (CO 1)
UNIT 5: Deep Generative Models: Boltzmann Machines, Restricted Boltzmann Machines, Deep
Belief Networks, Deep Boltzmann Machines, Sigmoid Belief Networks, Directed Generative Nets,
Drawing Samples from Autoencoders. [14 hrs] (CO 5)
Last Year Question Papers
Academic Calendar
Session 2024
Faculty Time Table
GULZAR GROUP OF ENGINEERING, LUDHIANA
LESSON PLAN
B. TECH CSE (AIML & DS)
DEEP LEARNING (BTCS 704-18)
Course Outcomes:
CO1: Comprehend the advancements in learning techniques.
CO2: Compare and explain various deep learning architectures and algorithms.
CO3: Demonstrate the applications of Convolution Networks.
CO4: Apply Recurrent Networks for Sequence Modelling.
Reference: RMSprop, https://deepchecks.com/glossary/rmsprop/
List of Registered Students
Branch/Course: B. Tech. - CSE (AIML & DS)    Course: DEEP LEARNING    Semester: 7th    Section/Group: A
Sr. No.  Roll No.  Name of Student          Contact No  Email
2        2121932   Abhishek Kumar Mishra
7        2121937   Ghansham
8        2121938   HARSHIT KUMAR MISHRA
13       2121943   Krishu
23       2121958   SHAILENDRA
26       2121961   SONU
29       2121965   Vaseem
44       2221668   Tanvi
Quiz 1
Which of the following is a popular activation function used in deep learning models?
a) Softmax
b) ReLU
c) Tanh
d) All of the above
Which of the following techniques is used to prevent the vanishing gradient problem?
a) Batch normalization
b) ReLU activation function
c) LSTM cells
d) All of the above
Which type of neural network is best suited for sequential data, such as time series or
text?
a) Convolutional Neural Network (CNN)
b) Recurrent Neural Network (RNN)
c) Feedforward Neural Network
d) Autoencoder
What does it mean when a model is "overfitting" the training data?
a) The model performs well on the training data but poorly on new, unseen data
b) The model performs well on both training and test data
c) The model performs poorly on both training and test data
d) The model's complexity is too low
Which of the following optimization algorithms is used to train deep learning models?
a) Gradient Descent
b) Stochastic Gradient Descent (SGD)
c) Adam
d) All of the above
Quiz 2
Which of the following is a commonly used loss function for binary classification
problems?
a) Mean Squared Error
b) Binary Cross-Entropy
c) Categorical Cross-Entropy
d) Hinge loss
In the context of Convolutional Neural Networks (CNNs), what does "stride" refer
to?
a) The size of the convolutional filter
b) The number of pixels by which the filter shifts over the input at each step
c) The number of filters in a layer
d) The depth of the network
Which type of layer is typically used in a neural network to reduce the risk of
overfitting?
a) Convolutional layer
b) Dense layer
c) Dropout layer
d) Pooling layer
In the context of Recurrent Neural Networks (RNNs), what does "vanishing gradient" refer
to?
a) Gradients that grow uncontrollably during training
b) Gradients that shrink toward zero as they are propagated back through time steps
c) Gradients that are removed by dropout
d) Gradients that oscillate around a fixed value
What is the key feature of Long Short-Term Memory (LSTM) networks that helps them
handle long-term dependencies?
a) Convolutional layers
b) Gated units that control the flow of information
c) Dropout regularization
d) Large number of layers
Which of the following techniques is used to augment training data in image classification
tasks?
a) Dropout
b) Batch normalization
c) Data augmentation
d) Early stopping
Which of the following is NOT an activation function?
a) Sigmoid
b) Tanh
c) ReLU
d) k-means
What is the main advantage of using a Convolutional Neural Network (CNN) for
image processing tasks?
a) It automatically learns spatial features through shared convolutional filters
b) It requires no labeled training data
c) It always trains faster than any other network
d) It eliminates the need for image preprocessing
Which of the following techniques randomly deactivates neurons during training to reduce overfitting?
a) L1 regularization
b) L2 regularization
c) Dropout regularization
d) Data augmentation
In the context of neural networks, what does "gradient descent" refer to?
a) An algorithm that iteratively updates weights in the direction that reduces the loss
b) A method for initializing weights
c) A technique for pruning layers
d) A type of activation function
Sr. No.  Name of Student         Roll No.
1        Abhinav Thakur          2121931
2        Abhishek Kumar Mishra   2121932
3        Ashish Sharma           2121933
4        Bhavya Rathore          2121934
5        Dhanraj Verma           2121935
7        Ghansham                2121937
8        HARSHIT KUMAR MISHRA    2121938
9        Hartej Singh Gill       2121939
10       INDRA PRASAD THARU      2121940
11       Jasmeet Kaur            2121941
12       Khushpreet Kaur         2121942
13       Krishu                  2121943
15       MD Rahil Khan           2121946
16       Md Shadakat Ekwal       2121947
17       Mohd Amir               2121948
23       SHAILENDRA              2121958
26       SONU                    2121961
29       Vaseem                  2121965
44       Tanvi                   2221668
Academically Weak Students
Identified as those having less than 40% marks in MST(s)
Sr. No.  Name of Student  Roll Number  MST(s) marks  Student attendance
1
Fast Learners
Identified as those having good past results and MST(s) scores
Branch/Course:                                   Semester:
Sr. No.  Name of Student  Roll Number  MST(s) marks  University position last year, if any
1
2
3
4
5
Sr. No.  Student Name  Roll No.  Total No. of Classes  Classes Attended  % of Attendance
7 Ghansham 2221632
13 Krishu 2221639
14 Md anas alam 2221643
23 SHAILENDRA 2221655
26 SONU 2221659
29 Vaseem 2221662
44 Tanvi 2221668
Subject Notes
Deep learning is a branch of machine learning built entirely on artificial neural networks.
Because neural networks are designed to mimic the human brain, deep learning can be thought
of as a way of mimicking how the brain learns.
This Deep Learning tutorial is your one-stop guide for learning everything about Deep
Learning. It covers both basic and advanced concepts, providing a comprehensive
understanding of the technology for both beginners and professionals. Whether you’re new
to Deep Learning or have some experience with it, this tutorial will help you learn about
different technologies of Deep Learning with ease.
What is Deep Learning?
Deep Learning is a part of Machine Learning that uses artificial neural networks to learn
from lots of data without needing explicit programming. These networks are inspired by the
human brain and can be used for things like recognizing images, understanding speech, and
processing language. There are different types of deep learning networks, like feedforward
neural networks, convolutional neural networks, and recurrent neural networks. Deep
Learning needs lots of labeled data and powerful computers to work well, but it can achieve
very good results in many applications.
The reasons why deep learning has become the industry standard:
Handling unstructured data: Deep learning models can learn directly from unstructured data
such as text, images, and audio, reducing the time and resources spent standardizing data sets.
Handling large data: With the introduction of graphics processing units (GPUs), deep
learning models can process large amounts of data at high speed.
High accuracy: Deep learning models deliver highly accurate results in computer vision,
natural language processing (NLP), and audio processing.
Pattern recognition: Where most traditional models require intervention from a machine
learning engineer, deep learning models can detect all kinds of patterns automatically.
In this tutorial, we are going to dive into the world of deep learning and discover all the key
concepts required for you to start a career in artificial intelligence (AI). If you're looking to
learn with some practical exercises, check out our course, An Introduction to Deep
Learning in Python.
Before diving into the intricacies of deep learning algorithms and their applications, it's
essential to understand the foundational concepts that make this technology so revolutionary.
This section will introduce you to the building blocks of deep learning: neural networks, deep
neural networks, and activation functions.
Neural networks
At the heart of deep learning are neural networks, which are computational models inspired
by the human brain. These networks consist of interconnected nodes, or "neurons," that work
together to process information and make decisions. Just like our brain has different regions
for different tasks, a neural network has layers designated for specific functions.
We have a full guide, What are Neural Networks, which covers the essentials in more
detail.
What makes a neural network "deep" is the number of layers it has between the input and
output. A deep neural network has multiple layers, allowing it to learn more complex features
and make more accurate predictions. The "depth" of these networks is what gives deep
learning its name and its power to solve intricate problems.
Our introduction to deep neural networks tutorial covers the significance of DNNs in deep
learning and artificial intelligence.
Activation functions
In a neural network, activation functions are like the decision-makers. They determine what
information should be passed along to the next layer. These functions add a level of
complexity, enabling the network to learn from the data and make nuanced decisions.
Deep learning uses feature extraction to recognize similar features of the same label and then
uses decision boundaries to determine which features accurately represent each label. In a
cats and dogs classification task, the deep learning model extracts information such as the
eyes, face, and body shape of each animal and divides the images into two classes.
The deep learning model consists of deep neural networks. A simple neural network
consists of an input layer, a hidden layer, and an output layer. Deep learning models use
multiple hidden layers; it is the additional layers that improve the model's accuracy.
The input layers contain raw data, and they transfer the data to hidden layers' nodes. The
hidden layers' nodes classify the data points based on the broader target information, and with
every subsequent layer, the scope of the target value narrows down to produce accurate
assumptions. The output layer uses the hidden layers' information to select the most probable
label: in our case, accurately predicting a dog's image rather than a cat's.
Recently, the world of technology has seen a surge in artificial intelligence applications, and
they all are powered by deep learning models. The applications range from recommending
movies on Netflix to Amazon warehouse management systems.
In this section, we are going to learn about some of the most famous applications built using
deep learning. This will help you realize the full potential of deep neural networks.
Computer Vision
Computer vision (CV) is used in self-driving cars to detect objects and avoid collisions. It is
also used for face recognition, pose estimation, image classification, and anomaly detection.
Generative AI
Generative AI has seen a surge in demand; a CryptoPunk NFT, part of a generative art
collection created using deep learning models, recently sold for $1 million. The introduction
of the GPT-4 model by OpenAI has revolutionized the text generation domain with its powerful
ChatGPT tool; you can now teach models to write an entire novel or even write code for your
data science projects.
Translation
Deep learning translation is not limited to language translation: we can now translate
photos to text using OCR, or translate text to images using NVIDIA GauGAN2.
Time Series Forecasting
Time series forecasting is used for predicting market crashes, stock prices, and changes in
the weather. The financial sector runs on speculation and future projections, and deep
learning time series models are better than humans at detecting patterns, making them pivotal
tools in this and similar industries.
Let's learn about different types of deep learning models and how they work.
Supervised Learning
Supervised learning uses a labeled dataset to train models to either classify data or predict
values. The dataset contains features and target labels, which allow the algorithm to learn over
time by minimizing the loss between predicted and actual labels. Supervised learning can be
divided into classification and regression problems.
Classification
The classification algorithm divides the dataset into various categories based on extracted
features. Popular deep learning models include ResNet50 for image classification
and BERT for text classification.
Regression
Instead of dividing the dataset into categories, the regression model learns the relationship
between input and output variables to predict the outcome. Regression models are commonly
used for predictive analysis, weather forecasting, and predicting stock market
performance. LSTM and RNN are popular deep learning regression models.
Unsupervised Learning
Unsupervised learning algorithms learn the patterns within an unlabeled dataset and create
clusters. Deep learning models can learn hidden patterns without human intervention, and
these models are often used in recommendation engines.
Unsupervised learning is used for grouping various species, medical imaging, and market
research. The most common deep learning model for clustering is the deep embedded
clustering algorithm.
Reinforcement Learning
Reinforcement learning (RL) is a machine learning method where agents learn various
behaviors from the environment. This agent takes random actions and gets rewards. The agent
learns to achieve goals by trial and error in a complex environment without human
intervention.
Just like a baby with encouragement from its parents learns to walk, the AI learns to perform
certain tasks by maximizing rewards, and the designer sets the rewards policy. Recently, RL
has seen high demands in automation due to advancements in robotics, self-driving cars,
defeating pro players in games, and landing rockets back to earth.
Generative Adversarial Networks
Generative adversarial networks (GANs) use two neural networks that together produce
synthetic instances of original data. GANs have gained a lot of popularity in recent
years as they are able to mimic some of the great artists and produce masterpieces. They are
widely used for generating synthetic art, video, music, and text. Learn more about real-world
applications in the Generative Adversarial Networks Tutorial.
Deep Feed-Forward Networks (I will use the abbreviation DFN for the rest of the article)
are neural networks in which the input only feeds forward through a function, say f*, and
never backwards. There is no feedback mechanism in a DFN. Networks that do have a
feedback mechanism from the output are called Recurrent Neural Networks (I am also
planning to write about those later).
We can find the final output value by initializing input variables and accordingly computing
nodes of the graph.
Computational Graphs in Deep Learning
Computations of the neural network are organized in terms of a forward pass or forward
propagation step in which we compute the output of the neural network, followed by a
backward pass or backward propagation step, which we use to compute
gradients/derivatives. Computation graphs explain why it is organized this way.
This gives us an idea of how computational graphs make it easier to get the derivatives
using backpropagation.
Types of computational graphs:
Type 1: Static Computational Graphs
Involves two phases:
o Phase 1: Define the architecture of the graph in advance.
o Phase 2: Train the model and generate predictions by feeding it a lot of data.
The benefit of utilizing this graph is that it enables powerful offline graph optimization and
scheduling. As a result, they should be faster than dynamic graphs in general.
The drawback is that handling structured and variable-sized data is awkward.
Type 2: Dynamic Computational Graphs
As the forward computation is performed, the graph is implicitly defined.
This graph has the advantage of being more adaptable. The library is less intrusive and
enables interleaved graph generation and evaluation. The forward computation is
implemented in your preferred programming language, complete with all of its features and
algorithms. Debugging dynamic graphs is simple. Because it permits line-by-line execution
of the code and access to all variables, finding bugs in your code is considerably easier. If
you want to employ Deep Learning for any genuine purpose in the industry, this is a must-
have feature.
The disadvantage of employing this graph is that there is limited time for graph
optimization, and the effort may be wasted if the graph does not change.
Once the above process is done, we perform the forward pass again to check whether we now
obtain the target output of 0.5.
While performing the forward pass again, we obtain the following values:
y3 = 0.57
y4 = 0.56
y5 = 0.61
We can clearly see that our y5 value is 0.61, which is not the expected output, so we again
need to find the error and backpropagate through the network, updating the weights until the
target output is obtained.
Error = y_target - y5
      = 0.5 - 0.61
      = -0.11
This is how backpropagation works: we perform the forward pass first to see whether we
obtain the target output; if not, we find the error and propagate it backwards through the
layers of the network, adjusting the weights according to the error. This process continues
until the neural network produces the target output.
What are recurrent neural networks?
A recurrent neural network (RNN) is a type of artificial neural network which uses
sequential data or time series data. These deep learning algorithms are commonly used for
ordinal or temporal problems, such as language translation, natural language processing (NLP),
speech recognition, and image captioning; they are incorporated into popular applications
such as Siri, voice search, and Google Translate. Like feedforward and convolutional neural
networks (CNNs), recurrent neural networks utilize training data to learn. They are
distinguished by their “memory” as they take information from prior inputs to influence the
current input and output. While traditional deep neural networks assume that inputs and
outputs are independent of each other, the output of a recurrent neural network depends on the
prior elements within the sequence. While future events would also be helpful in determining
the output of a given sequence, unidirectional recurrent neural networks cannot account for
these events in their predictions.
Let’s take an idiom, such as “feeling under the weather”, which is commonly used when
someone is ill, to aid us in the explanation of RNNs. In order for the idiom to make sense, it
needs to be expressed in that specific order. As a result, recurrent networks need to account
for the position of each word in the idiom and they use that information to predict the next
word in the sequence.
Through this process, RNNs tend to run into two problems, known as exploding gradients and
vanishing gradients. These issues are defined by the size of the gradient, which is the slope of
the loss function along the error curve. When the gradient is too small, it continues to become
smaller, updating the weight parameters until they become insignificant—i.e. 0. When that
occurs, the algorithm is no longer learning. Exploding gradients occur when the gradient is
too large, creating an unstable model. In this case, the model weights will grow too large, and
they will eventually be represented as NaN. One solution to these issues is to reduce the
number of hidden layers within the neural network, eliminating some of the complexity in the
RNN model.
Long short-term memory (LSTM): This is a popular RNN architecture, introduced by Sepp
Hochreiter and Jürgen Schmidhuber as a solution to the vanishing gradient problem. In their
paper, they work to address the problem of
long-term dependencies. That is, if the previous state that is influencing the current prediction
is not in the recent past, the RNN model may not be able to accurately predict the current
state. As an example, let’s say we wanted to predict the italicized words in the following: “Alice
is allergic to nuts. She can’t eat peanut butter.” The context of a nut allergy can help us
anticipate that the food that cannot be eaten contains nuts. However, if that context was a few
sentences prior, then it would make it difficult, or even impossible, for the RNN to connect
the information. To remedy this, LSTMs have “cells” in the hidden layers of the neural
network, which have three gates–an input gate, an output gate, and a forget gate. These gates
control the flow of information which is needed to predict the output in the network. For
example, if a gender pronoun, such as “she”, was repeated multiple times in prior sentences,
the forget gate may exclude it from the cell state.
Gated recurrent units (GRUs): This RNN variant is similar to LSTMs, as it also works to
address the short-term memory problem of RNN models. Instead of using a “cell state” to
regulate information, it uses hidden states, and instead of three gates, it has two: a reset gate
and an update gate. Similar to the gates within LSTMs, the reset and update gates control how
much and which information to retain.
Types of RBM:
There are mainly two types of Restricted Boltzmann Machine (RBM) based on the types of
variables they use:
1. Binary RBM: In a binary RBM, the input and hidden units are binary variables. Binary
RBMs are often used in modeling binary data such as images or text.
2. Gaussian RBM: In a Gaussian RBM, the input and hidden units are continuous variables
that follow a Gaussian distribution. Gaussian RBMs are often used in modeling continuous
data such as audio signals or sensor data.
Apart from these two types, there are also variations of RBMs such as:
1. Deep Belief Network (DBN): A DBN is a type of generative model that consists of
multiple layers of RBMs. DBNs are often used in modeling high-dimensional data such as
images or videos.
2. Convolutional RBM (CRBM): A CRBM is a type of RBM that is designed specifically for
processing images or other grid-like structures. In a CRBM, the connections between the
input and hidden units are local and shared, which makes it possible to capture spatial
relationships between the input units.
3. Temporal RBM (TRBM): A TRBM is a type of RBM that is designed for processing
temporal data such as time series or video frames. In a TRBM, the hidden units are
connected across time steps, which allows the network to model temporal dependencies in
the data.
Course File (Part B)
Table of Contents
Sr. No.  Particulars         Page No.
3        CO PO Mappings
7        MST's results
13       Assignments Result
Departmental Vision
To be the department of choice for students opting for computer science engineering education.
Departmental Mission
To prepare quality computer science professionals, who depending upon their choice will be
readily employable by the industry, or will venture for higher studies or entrepreneurship.
Department of Computer Science & Engineering
PROGRAM EDUCATION OBJECTIVES (PEO)
PEO1: To impart exhaustive knowledge of Computer Science & Engineering, applied sciences
and humanities as well as management abilities.
PEO2: To enable students to understand, analyze and solve real life problems in Computer
Science & Engineering through hands-on practice in laboratories.
PSO1. Students should be able to understand the principles, concepts, knowledge gained
during the course of the program to analyze, specify, design, develop, test and maintain real
life engineering problems/applications relating to industry/research/education work using
appropriate data structures, algorithms and latest technologies.
UNITS TO CO MAPPING
Unit  CO1  CO2  CO3  CO4  CO5
I     1    0    0    0    0
II    0    1    0    0    0
III   0    0    1    0    0
IV    0    0    0    1    0
V     0    0    0    0    1
MST Question Paper
MST Question Paper with Cos
MST questions mapping with COs
S.No. Question List Marks CO1 CO2 CO3 CO4 *BLOOM's Taxonomy
1 Ques 1 2
2 Ques 2 2
3 Ques 3 2
4 Ques 4 2
5 Ques 5 4
6 Ques 6 4
7 Ques 7 4
8 Ques 8 8
9 Ques 9 8
MST 1 Marks
Section Name: 21iNurture2 & 4
Program Name: B. Tech. - CSE (AIML & DS)
Semester: 7th
Course: Deep Learning
S. No.  Roll No.  Name            Marks
                  Milandeep Kour  24
1. Implementing a Custom Layer: Implement a custom neural network layer in a deep learning
framework of your choice (e.g., TensorFlow, PyTorch). Explain the functionality of this layer
and provide an example of how it can be used in a neural network architecture (a starting
sketch appears after this list).
2. GAN Training Stability: Train a Generative Adversarial Network (GAN) on a dataset of
your choice. Describe the challenges you faced in training the GAN, including any stability
issues. What techniques did you use to stabilize the training process?
3. Autoencoders: Build an autoencoder for image compression using a deep learning
framework. Evaluate the performance of your autoencoder in terms of compression ratio and
reconstruction quality. Discuss the trade-offs between the depth of the network and the quality
of the compressed images.
4. Hyperparameter Tuning: Choose a deep learning model and a corresponding dataset.
Perform hyperparameter tuning using techniques such as grid search, random search, or
Bayesian optimization. Report on the process and findings, including which hyperparameters
had the most significant impact on performance.
5. Sequence-to-Sequence Models: Implement a sequence-to-sequence model for a language
translation task. Discuss the challenges associated with training such models and how
attention mechanisms can help improve their performance. Provide an evaluation of your
model's performance on a test set.
Theoretical Questions
1. Bias-Variance Tradeoff: Explain the bias-variance tradeoff in the context of deep learning.
How does model complexity affect bias and variance? Provide a detailed analysis of this
tradeoff using mathematical formulations and graphical representations.
2. Activation Functions: Discuss the role of activation functions in deep learning. Compare and
contrast different activation functions such as ReLU, sigmoid, tanh, and their variants.
Provide examples of scenarios where one activation function might be more appropriate than
others.
3. Convergence and Initialization: Analyze the impact of weight initialization on the
convergence of deep learning models. Discuss different weight initialization techniques, such
as Xavier and He initialization. Provide experimental results to support your analysis.
4. Loss Functions: Discuss the importance of choosing the right loss function for a given task in
deep learning. Compare different loss functions used for classification and regression tasks.
Explain how imbalanced datasets can affect the choice of loss function and model
performance.
5. Regularization Techniques: Examine various regularization techniques used in deep
learning, such as L1/L2 regularization, dropout, batch normalization, and data augmentation.
Discuss the theoretical underpinnings of these techniques and provide examples of how they
improve model generalization.
Assignment Mapping CO & BLOOM's Taxonomy
Branch/ Course: B. Tech CSE (AIML & DS) Semester: 7th
Assignment No.:
S.No. Question List Marks CO1 CO2 CO3 CO4 CO5 *BLOOM's Taxonomy
1 Ques 01 30
2 Ques 02 30
3 Ques 03 30
Assignments Result
Marks: Assignment 1
Semester 7th
Deep Learning
30