Chapter 12
Deep Neural Networks
(Part II)
LSTM
[Link]. Dương Tuấn Anh
7/2021
Outline
• 1. Long Short-Term Memory (LSTM)
• 2. Applications of deep neural networks
• 3. Conclusions
1. Long Short-Term Memory (LSTM)
Recurrent neural network
A recurrent neural network (RNN) is a kind of artificial neural network
designed to deal with sequential data. An example of a stock price
series is shown in Figure 12.17.
• In a recurrent neural network, the output of a hidden unit at the
current time step is based on the input value at that time step and
the output of the hidden unit at the previous time step. This
enables the network to remember information from previous time
steps.
Figure 12.17
Feed-forward neural network and recurrent neural network
(a)
(b)
Figure 12.18 (a) Feed-forward neural network (b) recurrent neural
network
Recurrent neural network: the output of a hidden unit can serve as the
value of an input unit at the next time step.
Recurrent neural network
• The internal operations of the recurrent cell in an RNN
are illustrated in Figure 12.19.
Figure 12.19 Internal operation of a traditional RNN cell.
Recurrent neural network (cont.)
• Assume that x = (x_1, x_2, …, x_t) represents a data
sequence of length t, and h_t is the value of a hidden
node in the RNN at time step t. The value of a hidden
node (i.e. the information stored in the RNN) is
recomputed at every time step by the following
equation:
h_t = σ(W_x·x_t + W_h·h_{t−1} + b_t)
where σ is the activation function (e.g. the sigmoid function,
the tanh function, or the ReLU (rectified linear unit)
function), W_x and W_h are the adjustable weight vectors,
x_t is the input vector, and b_t is the bias vector.
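The update above can be sketched in NumPy; the dimensions (3 input features, 4 hidden units), the random weights, and the choice of sigmoid as the activation are illustrative assumptions, not values from the text:

```python
import numpy as np

# Sketch of the RNN update h_t = sigma(W_x x_t + W_h h_{t-1} + b_t).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4                        # assumed sizes for illustration
W_x = rng.normal(size=(n_hidden, n_in))      # input-to-hidden weights
W_h = rng.normal(size=(n_hidden, n_hidden))  # hidden-to-hidden weights
b = np.zeros(n_hidden)                       # bias vector

x_seq = rng.normal(size=(5, n_in))           # a sequence of length 5
h = np.zeros(n_hidden)                       # initial hidden state
for x_t in x_seq:
    # The same weights are reused at every time step; h carries
    # information from previous steps into the current update.
    h = sigmoid(W_x @ x_t + W_h @ h + b)
print(h.shape)  # (4,)
```

Note that the recurrence is just a loop: the hidden state computed at one step is fed back in at the next.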
ReLU function
• The ReLU (rectified linear unit) function is defined as follows:
ReLU(x) = max(0, x)
Figure 12.20 ReLU function
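The definition above is elementwise: negative inputs are clipped to 0 and positive inputs pass through unchanged, as this small sketch shows:

```python
import numpy as np

# ReLU(x) = max(0, x), applied elementwise to an array.
def relu(x):
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])).tolist())  # [0.0, 0.0, 0.0, 1.5]
```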
Recurrent neural network (cont.)
• For an RNN with several hidden layers, the training
process using the back-propagation-through-time algorithm
incurs the problem of exploding or vanishing gradients.
• Therefore, the LSTM network was proposed to
overcome these weaknesses of recurrent neural
networks.
• Each LSTM unit consists of a cell state (or cell memory)
and 3 gates.
• A cell state in an LSTM network is equivalent to a hidden
unit in an RNN.
LSTM network
• LSTM is an improved version of RNN, proposed in 1997
by Hochreiter and Schmidhuber to deal with the long-term
dependencies of sequential data.
• The traditional RNN cannot remember the long-term
dependencies among data values in a long sequence;
therefore, the first data value in a sequence does not have
a significant influence on the values predicted at later
time steps.
• An LSTM network consists of several cell states which can
represent time-dependent information, as hidden nodes do in
an RNN.
LSTM cell
• The main block of LSTM is called the cell state. LSTM
can use or ignore the information which flows through
the gates; this is how the information flow within the
LSTM cell is controlled. Each gate uses a sigmoid
activation function with range [0, 1] to represent how much
information is allowed to flow. If the sigmoid
function gives the value 0, no information goes
through; if the function gives the value 1, all
the information is allowed to go through.
• Each LSTM cell consists of 3 gates to control and
protect the cell state: the forget gate, the input gate and
the output gate.
LSTM cell
Figure 12.21 LSTM cell (block)
Forget gate
• The forget gate controls which elements of the cell
state vector will be forgotten.
• In Equation (1), f_t is the resulting vector of the forget gate
at the current time step, σ is the sigmoid function,
C_{t−1} and h_{t−1} represent the cell state and the output of
the hidden unit at the previous time step, and W_f and b_f
represent the weight vector and the bias vector from
the input layer to the forget gate.
f_t = σ(W_f·[C_{t−1}, h_{t−1}, x_t] + b_f)   (1)
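Equation (1) can be sketched in NumPy; [·,·,·] is read here as vector concatenation, and all dimensions and random weights are illustrative assumptions:

```python
import numpy as np

# Sketch of the forget gate f_t = sigma(W_f . [C_{t-1}, h_{t-1}, x_t] + b_f).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_in, n_hidden = 3, 4                       # assumed sizes for illustration
C_prev = rng.normal(size=n_hidden)          # previous cell state C_{t-1}
h_prev = rng.normal(size=n_hidden)          # previous hidden output h_{t-1}
x_t = rng.normal(size=n_in)                 # current input x_t

z = np.concatenate([C_prev, h_prev, x_t])   # [C_{t-1}, h_{t-1}, x_t]
W_f = rng.normal(size=(n_hidden, z.size))   # forget-gate weight vector
b_f = np.zeros(n_hidden)                    # forget-gate bias vector

# Each entry of f_t lies in (0, 1): near 0 = forget that element of the
# cell state, near 1 = keep it.
f_t = sigmoid(W_f @ z + b_f)
print(f_t.shape)  # (4,)
```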
Input gate
• The input gate controls the information which should enter
the cell state. This gate has two parts: a sigmoid layer and a
tanh layer. The sigmoid layer selects the information
from h_{t−1}, x_t and C_{t−1}. The tanh layer generates the
candidate values which are added to the cell memory. The
output values of the sigmoid layer and the tanh layer are
computed as follows:
i_t = σ(W_i·[C_{t−1}, h_{t−1}, x_t] + b_i)   (2)
Ĉ_t = tanh(W_c·[h_{t−1}, x_t] + b_c)   (3)
where i_t is the output value of the input gate; W_i and b_i
represent the weight vector and bias vector of the input gate. W_c and
b_c in (3) represent the weight vector and bias vector of the cell
state.
Cell state
• The cell state at the previous time step, denoted as
C_{t−1}, is updated to become C_t.
• This is done by multiplying the cell state at the previous
time step C_{t−1} with f_t and adding i_t * Ĉ_t, which
becomes the new information that needs to be stored.
This step is described by the following equation:
C_t = f_t * C_{t−1} + i_t * Ĉ_t   (4)
Output gate
• The output gate determines which information from
the cell state is chosen to be the output value of
the LSTM cell.
• In Equations (5) and (6), o_t is the output value of
the output gate, W_o and b_o represent the weight vector
and bias vector of the output gate, and
h_t is the output of the hidden layer at the current time
step.
o_t = σ(W_o·[C_t, h_{t−1}, x_t] + b_o)   (5)
h_t = o_t * tanh(C_t)   (6)
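Putting Equations (1)–(6) together, one forward step of an LSTM cell can be sketched as follows. This is a minimal NumPy sketch under assumed dimensions and random weights; it follows the slides' formulation, in which C_{t−1} (or C_t) appears inside the gate inputs:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    zc = np.concatenate([C_prev, h_prev, x_t])  # [C_{t-1}, h_{t-1}, x_t]
    zh = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ zc + b_f)               # (1) forget gate
    i_t = sigmoid(W_i @ zc + b_i)               # (2) input gate
    C_hat = np.tanh(W_c @ zh + b_c)             # (3) candidate values
    C_t = f_t * C_prev + i_t * C_hat            # (4) cell-state update
    zo = np.concatenate([C_t, h_prev, x_t])     # [C_t, h_{t-1}, x_t]
    o_t = sigmoid(W_o @ zo + b_o)               # (5) output gate
    h_t = o_t * np.tanh(C_t)                    # (6) hidden output
    return h_t, C_t

rng = np.random.default_rng(0)
n_in, n_h = 3, 4                                # assumed sizes
W_f = rng.normal(size=(n_h, 2 * n_h + n_in)); b_f = np.zeros(n_h)
W_i = rng.normal(size=(n_h, 2 * n_h + n_in)); b_i = np.zeros(n_h)
W_c = rng.normal(size=(n_h, n_h + n_in));     b_c = np.zeros(n_h)
W_o = rng.normal(size=(n_h, 2 * n_h + n_in)); b_o = np.zeros(n_h)

h, C = np.zeros(n_h), np.zeros(n_h)
for x_t in rng.normal(size=(5, n_in)):          # run over a short sequence
    h, C = lstm_step(x_t, h, C, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o)
print(h.shape, C.shape)  # (4,) (4,)
```

Unlike the plain RNN, two quantities are carried between time steps: the hidden output h_t and the cell state C_t, and the gates decide what is forgotten, added, and emitted.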
Architecture of LSTM network
• An LSTM network can contain more than one hidden layer. It
consists of several layers, each made up of a number
of LSTM cells, such that the output values of the previous layer
become the input values for the next layer.
• The architecture of an LSTM network is illustrated in Figure
12.22.
Architecture of LSTM network
Figure 12.22 Architecture of LSTM network
Applications of LSTM networks
• Robot control
• Time series prediction
• Speech recognition
• Music composition
• Grammar learning
• Natural Language processing
• Handwriting recognition
• Human action recognition
• Sentiment analysis
Time Series Forecasting with LSTM
Figure 12.23: Training LSTM for time
series forecasting
Training LSTM for time series forecasting
• Figure 12.23 illustrates one iteration step in the
training process of the LSTM. A random batch of
input data x consisting of m independent training
samples (depicted by the colours) is used in each
step. Each training sample consists of n data points
and one target value (y_obs) to predict. The loss is
computed from the target value and the network's
prediction y_sim, and is used to update the network
parameters.
EX: A time series: 2 3 5 4 6 8 5 7 11 13 9 7
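The example series above can be turned into (input window, target) training samples with a sliding window; the window length n = 3 here is an assumed illustration, not a value from the slides:

```python
# Build (window of n points, next value to predict) pairs from the series.
series = [2, 3, 5, 4, 6, 8, 5, 7, 11, 13, 9, 7]
n = 3  # assumed window length

samples = [(series[i:i + n], series[i + n]) for i in range(len(series) - n)]
for x, y_obs in samples[:3]:
    print(x, "->", y_obs)
# [2, 3, 5] -> 4
# [3, 5, 4] -> 6
# [5, 4, 6] -> 8
```

Each pair is one training sample: the network reads the n points of the window and is trained so that its prediction y_sim matches the target y_obs.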
2. Applications of Deep Neural
Networks
Applications:
• Speech and audio: speech recognition, audio
and music processing.
• Image and video: image recognition, computer
vision.
• Language modeling: machine translation, text
information retrieval.
• Time series prediction
3. Conclusions
• Building/learning deep architectures and hierarchies of
features is highly desirable.
• CNN is suitable for image data and LSTM is suitable for
sequential data.
• Deep learning is an emerging technology. Despite the
promising results reported so far, much remains to be developed.
• The current optimization techniques for learning deep
architectures should be improved.
• To make deep learning techniques scalable to very large
training data, sound parallel learning algorithms or more
effective architectures than the existing ones need to be
developed.
• How to choose sensible values for hyper-parameters, such as
the learning rate schedule, the number of layers and the number
of units per layer, remains an open question.
Terminology
• Recurrent neural network: mạng nơ ron hồi quy,
sequential series: chuỗi tuần tự, vanishing gradient: độ
dốc triệt tiêu, exploding gradient: độ dốc bùng nổ, long-
term dependency: sự phụ thuộc dài hạn, LSTM block:
khối LSTM, LSTM cell: tế bào LSTM, cell state: trạng
thái tế bào, forget gate: cổng quên, input gate: cổng
nhập, output gate: cổng xuất, sigmoid function: hàm
sigmoid, tanh function: hàm tanh, time series
prediction: dự báo chuỗi thời gian, sliding window: cửa
sổ trượt, input vector: véc tơ đầu vào, epoch: kỷ
nguyên