RNN and LSTM
▪ No cycles or loops in the network
▪ No memory about the past
Recurrent Neural Networks
▪ Can handle sequential data
▪ Considers the current input and
also the previously received
inputs
▪ Can memorize inputs due to its
internal memory
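The "internal memory" above can be sketched as a minimal one-unit recurrent cell in plain Python (the weights here are made up for illustration, not trained): the hidden state h carries information from every earlier input into each new step.

```python
import math

def rnn_step(x, h_prev, w_x=0.8, w_h=0.5, b=0.0):
    """One step of a minimal 1-unit recurrent cell (illustrative weights)."""
    # The new hidden state mixes the current input with the previous
    # hidden state -- the network's "internal memory".
    return math.tanh(w_x * x + w_h * h_prev + b)

# Each step sees the current input AND everything summarized in h so far.
h = 0.0
for x in [1.0, 0.5, -0.3]:
    h = rnn_step(x, h)
```

Because h accumulates the history, feeding the same inputs in a different order ends in a different hidden state, which is exactly why RNNs are sensitive to sequence order.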
WHY RNN?
Applications of RNN
1 Time Series Prediction
2 NLP: Text Classification, Sentiment Analysis, Document Summarization, Question Answering
3 Machine Translation: translate the input into a different language
4 Image Captioning: caption an image by analysing the activities in it
5 Speech Recognition
PURPOSE
RNNs are well-suited for tasks where the temporal order of the data matters. This
includes applications such as time series prediction, natural language processing,
and speech recognition.
An unrolled recurrent neural network
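"Unrolling" simply means applying the same cell, with the same shared weights, once per time step. A sketch, assuming a toy one-unit cell with made-up weights:

```python
import math

def cell(x, h, w_x=0.6, w_h=0.4):
    # The SAME weights are reused at every time step of the unrolled network.
    return math.tanh(w_x * x + w_h * h)

def unroll(xs):
    """Apply the shared cell across a sequence, keeping every hidden state."""
    h, states = 0.0, []
    for x in xs:
        h = cell(x, h)
        states.append(h)
    return states

states = unroll([1.0, 0.0, -1.0])
```

Note that even a zero input produces a nonzero hidden state, because the cell still sees the memory carried over from the previous step.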
Recurrent Neuron
LSTM
▪ Introduced by Hochreiter & Schmidhuber (1997)
Notation
LSTM Breakdown
▪ Forget Gates
▪ Input Gates
▪ Output Gates
▪ Hidden State
The sigmoid layer outputs numbers between zero and one, describing how much of each component should be let through. A value of zero means “let nothing through,” while a value of one means “let everything through!”
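This "let through" behaviour is just element-wise multiplication by a sigmoid. A small sketch with made-up control values (not part of any real trained model):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate(values, controls):
    # Element-wise gating: each sigmoid(control) in (0, 1) scales how
    # much of the corresponding value is "let through".
    return [sigmoid(c) * v for v, c in zip(values, controls)]

# A large negative control pushes sigmoid toward 0 (block everything);
# a large positive control pushes it toward 1 (let everything through).
out = gate([2.0, 2.0], [-10.0, 10.0])
```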
▪ Forget Gates: decide what is kept from previous states
▪ Input Gates: decide what is added to the hidden state
▪ Hidden State: carries previous information forward
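Putting the gates together, one LSTM step for a hypothetical one-unit cell can be sketched as follows (scalar weights in `p` are chosen purely for illustration; real LSTMs use learned weight matrices):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step; p maps names to scalar weights (illustrative only)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])        # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])        # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])        # output gate
    c_cand = math.tanh(p["wc"] * x + p["uc"] * h_prev + p["bc"]) # candidate
    c = f * c_prev + i * c_cand    # keep part of the old state, add new info
    h = o * math.tanh(c)           # expose a gated view of the cell state
    return h, c

# Saturate the forget gate open and the input gate shut: the cell state
# should then carry the previous information through essentially unchanged.
p = dict(wf=0, uf=0, bf=100, wi=0, ui=0, bi=-100,
         wo=0, uo=0, bo=0, wc=0, uc=0, bc=0)
h, c = lstm_step(x=0.5, h_prev=0.0, c_prev=0.7, p=p)
```

The additive update `c = f * c_prev + i * c_cand` is what lets information survive across many steps, which is how LSTMs sidestep the vanishing-gradient problem mentioned in the conclusion.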
For the language model example, since it just saw a subject, it might want
to output information relevant to a verb, in case that’s what is coming next.
For example, it might output whether the subject is singular or plural, so that
we know what form a verb should be conjugated into if that’s what follows
next.
LSTM APPLICATIONS
CONCLUSION
In conclusion, both RNNs and LSTMs are powerful tools for processing
sequential data. RNNs maintain a cyclic structure to retain memory, but
the vanishing gradient problem limits their effectiveness over long
sequences. LSTMs address this issue with a more sophisticated
architecture, allowing them to capture and retain long-term
dependencies more effectively. LSTMs have become instrumental in
various machine learning and artificial intelligence applications, where
understanding and utilizing context over time are critical.