Stock Price Prediction in Vietnam Using Stacked LSTM

Nguyen Trung Tuan1 , Thu Hang Nguyen2(B) , and Thanh Thi Hien Duong2
1 National Economics University, Hanoi, Vietnam
[email protected]
2 Hanoi University of Mining and Geology, Hanoi, Vietnam

{nguyenthuhang,duongthihienthanh}@humg.edu.vn

Abstract. The direction of the stock market is complex, stochastic, and highly
volatile. In addition to traditional forecasting models such as linear regression
and the Autoregressive Integrated Moving Average (ARIMA) model, analysts are now
applying modern deep learning models to predict the trend direction of the stock
market and achieve more accurate forecasts. In this research, we investigate and
apply a state-of-the-art deep learning sequential model, the Stacked Long Short-Term
Memory model (Stacked LSTM), to the prediction of next-day stock prices. The
experimental results on three benchmark datasets, the stocks of Apple Inc. (AAPL),
An Phat Bioplastic JSC (AAA), and Bank of Foreign Trade of Vietnam (VCB), show the
effectiveness of the predictive model. Furthermore, we found that the most suitable
number of hidden layers is two: when we increase the number of hidden layers to three
or four, the Stacked LSTM model does not improve its predictive power, even though
it has a more complex structure.

Keywords: Stock price prediction · Stock forecasting · Time series forecasting ·
Stacked LSTM

1 Introduction
Stock prices have long been a popular topic in the modern economy. Because the stock
market is highly volatile, stock price prediction has an important impact on trading
and investment decisions. Hence, stock price forecasting is a field of interest for
researchers, traders, investors, and corporations.
Financial time series with flexible and complex variables are not easy to forecast.
Stock market data are abundant and mostly nonlinear, so we must develop models that
can uncover many hidden layers of structure in the data. Traditional statistical
models, such as exponential smoothing and the Autoregressive Integrated Moving
Average (ARIMA), have long been widely used in economic and financial data analysis,
including stock forecasting problems [1–3]. Recent studies have shown that deep
learning models can discover hidden and dynamic patterns in the data through
self-learning, so these models can give better forecasting results than statistical
models [4].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


N.-T. Nguyen et al. (Eds.): ICIT 2022, LNDECT 148, pp. 246–255, 2022.
https://doi.org/10.1007/978-3-031-15063-0_23

Furthermore, recurrent neural networks (RNNs) are a powerful type of neural network
designed to handle sequence dependence. An RNN uses not only the input data but also
its previous outputs to predict the current output, so it is regarded as a network
for sequential data. The connections between the nodes of an RNN form a directed
graph, and its internal memory is used to process input sequences of flexible length.
The state of each node varies over time through a real-valued activation function.
Because the learned model in an RNN is defined in terms of transitions between
states, it always has the same input size. In addition, the same transition function
with the same parameters is used at every step [4].
One type of RNN is the Long Short-Term Memory (LSTM) network, which has a larger
structure and has been trained successfully on many tasks. RNNs suffer from vanishing
and exploding gradients. To address these issues, LSTM networks use “computational
gates” that help process the data and retain more of the needed information. Hence,
LSTM models have outperformed other sequence models on time-series data [5].
Furthermore, stock market price forecasting has been carried out with higher accuracy
using LSTM networks. Di Persio and Honchar compared three kinds of RNNs, namely a
basic RNN, an LSTM, and a Gated Recurrent Unit (GRU), and concluded that the other
RNNs were less accurate than the LSTM, which reached 72% [6]. Pang et al. proposed
two LSTM models to forecast the stock market: one with an embedding layer and another
with an autoencoder. The LSTM with the embedding layer performed better, with an
accuracy of 57.2% compared to 56.9% for the other [7].
Several factors are crucial for improving the performance of a DNN architecture for
prediction. Hiransha et al. showed that the amount of data is the first factor: the
larger the quantity of data, the higher the quality of the model's results [4]. In
addition, adding or removing hidden layers also affects model performance. The
research of Karsoliya confirmed that a model can run into training problems beyond
the fourth layer. Moreover, the number of hidden nodes in each layer can follow the
rule of thumb that a hidden layer has 2/3 as many nodes as the input layer [8].
Hossain et al. combined two models, LSTM and GRU, for prediction. First, the features
were passed to the LSTM to produce a forecast. This forecast was then fed to the GRU
model to produce one more forecast. The outcome of this combined model was better
than that of the LSTM model or the GRU model working independently [9]. In recent
years, combining models for forecasting has become popular, so the optimization of
each model deserves attention. Assunta et al. predicted stock prices with a
multilayer perceptron (MLP) model. They found that the optimal number of hidden
layers and of hidden nodes in each layer differed from case to case and must be
determined through trial and error [10].
Many models have been used to predict stock prices, but the LSTM remains one of the
most common choices in experiments with successful results. Motivated by this trend,
in our study we investigate Stacked LSTM models by considering the number of hidden
layers, the number of hidden nodes in each layer, and the size of the data, aiming
to find the optimal model for forecasting stock prices. The models were verified on
three datasets of historical daily stock prices of three companies, namely: Apple
Inc. (AAPL), An Phat Bioplastic JSC (AAA), and Bank of Foreign Trade of Vietnam
(VCB).
The paper is organized as follows. Section 2 briefly presents the LSTM and then the
Stacked LSTM model, whose architecture we optimize for stock price prediction.
Section 3 provides an empirical analysis to identify the parameter settings with
which the Stacked LSTM model achieves the best prediction outcomes. Finally, Sect. 4
concludes the paper.

2 Methodology

2.1 Long Short-Term Memory

The LSTM network is a particular RNN model that introduces an internal cell state
(memory state) and gating mechanisms. It can retain short-term memory while
capturing long-range dependencies in the data [11]. LSTM models have been applied in
the financial domain [9, 10] and the sequence learning domain [13] and have achieved
many good results.
Each LSTM cell works with three gates: the input gate ($i_t$), the output gate
($o_t$), and the forget gate ($f_t$) (see Fig. 1) [5, 14, 15].

Fig. 1. LSTM cell [14, 12]

The forget gate decides which information from the previous cell state is discarded.
It takes the output of the previous hidden state ($h_{t-1}$) and the input of the
current step ($x_t$) and passes them through the sigmoid activation function
($\sigma$), which outputs a vector whose values lie between 0 and 1. An output of 0
means that the information is removed, while 1 means that the information is kept.
The input gate, another sigmoid layer, chooses which values to update in the cell
state. A candidate memory cell $\tilde{C}_t$ is also created by applying the tanh
activation function to the same inputs, so its output lies between −1 and 1.
Information is removed from the cell state where the candidate is negative and added
where it is positive. How much each entry of the cell state is updated is defined by
the product of the tanh output and the input gate's sigmoid activation:
   
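$$ i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{1} $$

$$ f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{2} $$

$$ \tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{3} $$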

Then, the new cell state is obtained by multiplying the forget vector ($f_t$) by the
previous cell state ($C_{t-1}$) and adding the product of the input gate and the
tanh vector:

$$ C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \tag{4} $$

The output gate ($o_t$), a sigmoid layer, filters the output for the next step and
produces the hidden state ($h_t$):

$$ o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{5} $$

$$ h_t = o_t * \tanh(C_t) \tag{6} $$
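As an illustration of Eqs. (1)–(6), the following minimal Python sketch computes one
LSTM time step with plain lists. The function names (`sigmoid`, `affine`,
`lstm_step`), the layout of the parameter dictionary, and the all-zero toy weights
are assumptions made for this example; it is not the implementation used in the
experiments.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W, v, b):
    """Return W . v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following Eqs. (1)-(6)."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]        # Eq. (1)
    f_t = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]        # Eq. (2)
    c_tilde = [math.tanh(a) for a in affine(p["W_C"], z, p["b_C"])]  # Eq. (3)
    c_t = [f * c + i * g
           for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]         # Eq. (4)
    o_t = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]        # Eq. (5)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]               # Eq. (6)
    return h_t, c_t


# Toy setting (hypothetical values): hidden size 2, input size 1, zero weights.
hidden, inputs = 2, 1
zeros_W = [[0.0] * (hidden + inputs) for _ in range(hidden)]
zeros_b = [0.0] * hidden
params = {name: zeros_W for name in ("W_i", "W_f", "W_C", "W_o")}
params.update({name: zeros_b for name in ("b_i", "b_f", "b_C", "b_o")})

h_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -1.0], params)
print(c_t, h_t)
```

With all weights at zero, every gate equals 0.5 and the candidate is 0, so the cell
state is simply halved, which makes Eq. (4) easy to check by hand.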

2.2 Stacked LSTM


The basic LSTM model has only one hidden layer. A network expanded by adding further
hidden LSTM layers is called a Stacked LSTM. Each hidden layer of this network
contains multiple memory cells, and the layers are stacked on top of each other.
Furthermore, each layer produces one output per input time step. Therefore, each
hidden LSTM layer can pass a sequence output covering all input time steps to the
next layer, instead of a single output as in other models [16].
A multilayer perceptron becomes deeper when more hidden layers are added. Each added
hidden layer takes the representations learned by the previous layers during
training and combines them into new representations at higher levels of abstraction.
Hence, multi-layer models have improved prediction accuracy [17, 18].
Today, the Stacked LSTM is a strong method for forecasting with complex sequence
data because of its superior performance [19, 20].

3 Experiments and Results


3.1 Datasets
This research was conducted on three stock price datasets: AAPL stock, obtained from
Yahoo Finance, which is widely used in stock price forecasting research; and AAA
stock and VCB stock, collected from Vietstock [21]. The daily historical prices of
AAA stock and VCB stock run from 5 January 2015 to 31 December 2021, comprising 1744
and 1749 data points, respectively. The AAPL data run from 31 January 2015 to 29
December 2019, with 1236 data points. Each dataset has six attributes: Open, High,
Low, Close, Adj Close, and Volume (see Fig. 2).

Fig. 2. Stock price performance: (a) AAPL stock price, (b) AAA stock price, (c) VCB stock price.

3.2 Parameter Settings and Evaluation Metrics

We used the Keras library for Python to implement the DNN models and Jupyter Notebook
to conduct the analysis. Each dataset was divided into two sets: 80% of the data for
training and the remaining 20% for testing. The Stacked LSTM model was designed with
2 to 4 hidden layers. A dropout layer was placed before the Dense layer to avoid
overfitting. The rule of thumb was applied, so the number of nodes in the hidden
layers is 2/3 of the size of the input layer [8]. The number of LSTM layers and the
nodes of each layer in each case are shown in Table 1.

Table 1. The quantity of layers in the stacked LSTM model

Quantity of LSTM layers    Nodes of LSTM layers in order
2                          (45, 30)
3                          (67, 45, 30)
4                          (100, 67, 45, 30)
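The node counts in Table 1 are consistent with applying the 2/3 rule from one LSTM
layer to the next, starting from the first-layer size of each row. The sketch below
reproduces them and shows one way such a network might be assembled in Keras with
the MSE loss and Adam optimizer named in this section. The function names, the
single input feature, and the dropout rate of 0.2 are assumptions (the text does not
state them). The Keras import is deferred so that the size calculation runs without
TensorFlow installed.

```python
def two_thirds_sizes(first_layer_nodes, n_layers):
    """Layer sizes where each layer has about 2/3 the nodes of the previous one."""
    sizes = [first_layer_nodes]
    while len(sizes) < n_layers:
        sizes.append(round(sizes[-1] * 2 / 3))
    return tuple(sizes)


def build_stacked_lstm(layer_sizes, window=7, n_features=1, dropout_rate=0.2):
    """Assemble a stacked LSTM regressor (sketch; hyperparameters are assumptions)."""
    from tensorflow import keras  # deferred import: only needed to build the model

    model = keras.Sequential()
    model.add(keras.Input(shape=(window, n_features)))
    for k, units in enumerate(layer_sizes):
        is_last = k == len(layer_sizes) - 1
        # Every LSTM layer except the last returns the full output sequence
        # so that the next LSTM layer receives one vector per time step.
        model.add(keras.layers.LSTM(units, return_sequences=not is_last))
    model.add(keras.layers.Dropout(dropout_rate))  # dropout before the Dense layer
    model.add(keras.layers.Dense(1))               # next-day price
    model.compile(optimizer="adam", loss="mse")
    return model


# First-layer sizes taken from Table 1.
table_1 = {n: two_thirds_sizes(first, n) for n, first in ((2, 45), (3, 67), (4, 100))}
print(table_1)

# With TensorFlow installed, a model could then be built as:
# model = build_stacked_lstm(table_1[2])
```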

For training, we used the mean squared error (MSE), the most popular and widely
applied choice, as the loss function. In addition, the Stacked LSTM network was
optimized with the Adam algorithm [14, 20, 22]. Furthermore, given the large amount
of data, the model was run with 30 batches, 1000 epochs, and a sliding window size of
7 days.
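The 7-day sliding window and the 80/20 split can be sketched as follows. The helper
names, the toy series, and the choice to split chronologically after windowing are
assumptions for illustration; the text does not specify the order of these steps.

```python
def make_windows(series, window=7):
    """Turn a price series into (inputs, targets) pairs.

    Each input is `window` consecutive values; the target is the value
    on the following day.
    """
    X, y = [], []
    for start in range(len(series) - window):
        X.append(series[start:start + window])
        y.append(series[start + window])
    return X, y


def chronological_split(X, y, train_fraction=0.8):
    """Keep the earliest samples for training and the latest for testing."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


# Toy series of 17 "closing prices" (hypothetical values, not from the datasets).
series = [float(v) for v in range(1, 18)]
X, y = make_windows(series, window=7)
(X_train, y_train), (X_test, y_test) = chronological_split(X, y)

print(len(X_train), len(X_test))  # number of training and test windows
print(X[0], y[0])                 # first window and its next-day target
```

Splitting without shuffling keeps every test window later in time than the training
windows, so the model is never evaluated on days that precede its training data.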
Finally, the prediction models are evaluated by three metrics widely used in the
research community: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and
Mean Absolute Percentage Error (MAPE), where lower values are better. Let $N$ be the
number of samples, and let $x_t$ and $\hat{x}_t$ be the ground-truth and predicted
values, respectively. The evaluation metrics RMSE, MAE, and MAPE are defined by
Eqs. 7, 8, and 9, respectively [19, 23].

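$$ \mathrm{RMSE}(x, \hat{x}) = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left(x_t - \hat{x}_t\right)^2} \tag{7} $$

$$ \mathrm{MAE}(x, \hat{x}) = \frac{1}{N} \sum_{t=1}^{N} \left| x_t - \hat{x}_t \right| \tag{8} $$

$$ \mathrm{MAPE}(x, \hat{x}) = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{x_t - \hat{x}_t}{x_t} \right| \tag{9} $$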
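The three metrics of Eqs. 7, 8, and 9 can be computed with a few lines of Python.
The function names and the toy values are assumptions for this sketch. MAPE is
returned as a fraction rather than a percentage, which matches the scale of the
values reported in Sect. 3.3.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, Eq. (7)."""
    n = len(actual)
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error, Eq. (8)."""
    n = len(actual)
    return sum(abs(x - p) for x, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction, Eq. (9).

    Undefined when any actual value is zero.
    """
    n = len(actual)
    return sum(abs((x - p) / x) for x, p in zip(actual, predicted)) / n


# Toy values (hypothetical, not taken from the datasets).
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]

print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```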

3.3 Results and Discussion


The evaluation metrics in Tables 2 to 4 show that, for all three datasets, the model
with 2 LSTM layers performed best, with the lowest error measures. For AAPL stock
prices, the RMSE is 0.725, the MAE is 0.503, and the MAPE is 0.009. For AAA stock
prices, the RMSE, MAE, and MAPE are 0.285, 0.219, and 0.014, respectively. For VCB
stock, they are 1.793, 0.976, and 0.010, respectively. The prediction graphs for
these models are displayed in Fig. 3. In contrast, the error measures increased when
we raised the number of LSTM layers to 3 or 4. For both AAPL and VCB stock prices,
the RMSE with 4 LSTM layers was about double that with 3 LSTM layers.
Compared with other LSTM works, our contribution is to show that the model can be
improved by optimizing its parameters, so that it becomes most efficient with an
architecture of only two hidden layers.

Table 2. The performance of evaluation metrics for AAPL stock predictions

Quantity of LSTM layers    Metrics
                           RMSE     MAE      MAPE
2                          0.725    0.503    0.009
3                          1.280    0.949    0.017
4                          2.695    1.464    0.024

Table 3. The performance of evaluation metrics for AAA stock predictions

Quantity of LSTM layers    Metrics
                           RMSE     MAE      MAPE
2                          0.285    0.219    0.014
3                          0.380    0.278    0.018
4                          0.352    0.261    0.016

Table 4. The performance of evaluation metrics for VCB stock predictions

Quantity of LSTM layers    Metrics
                           RMSE     MAE      MAPE
2                          1.793    0.976    0.010
3                          1.994    1.112    0.012
4                          4.063    3.078    0.031
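The comparison across layer counts can be checked directly from the values reported
in Tables 2 to 4. The dictionary layout and variable names below are assumptions for
this illustration; the numbers are copied from the tables.

```python
# (RMSE, MAE, MAPE) per number of LSTM layers, copied from Tables 2-4.
results = {
    "AAPL": {2: (0.725, 0.503, 0.009), 3: (1.280, 0.949, 0.017), 4: (2.695, 1.464, 0.024)},
    "AAA": {2: (0.285, 0.219, 0.014), 3: (0.380, 0.278, 0.018), 4: (0.352, 0.261, 0.016)},
    "VCB": {2: (1.793, 0.976, 0.010), 3: (1.994, 1.112, 0.012), 4: (4.063, 3.078, 0.031)},
}

# Layer count with the lowest RMSE for each stock.
best = {
    ticker: min(rows, key=lambda n: rows[n][0]) for ticker, rows in results.items()
}

# Ratio of the 4-layer RMSE to the 3-layer RMSE for each stock.
rmse_ratio = {ticker: rows[4][0] / rows[3][0] for ticker, rows in results.items()}

for ticker in results:
    print(ticker, "best layers:", best[ticker],
          "RMSE(4)/RMSE(3):", round(rmse_ratio[ticker], 2))
```

For AAA the ratio is slightly below 1, so the doubling between 3 and 4 layers holds
for AAPL and VCB only, while 2 layers gives the lowest RMSE for all three stocks.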

Fig. 3. Performance of stock price prediction for 2 LSTM layers: (a) AAPL stock price, (b) AAA
stock price, (c) VCB stock price

4 Conclusion
In this research on stock price prediction, we recommend a neural network based on
an optimized Stacked LSTM architecture. One approach to improving performance is to
consider specific factors: the quantity of data, the ratio in which the dataset is
divided into training and test sets, the number of LSTM layers, and the number of
nodes in each LSTM layer. Our implementation showed that the best outcomes are
achieved with only 2 LSTM layers. Because the model has a lot of data to train on,
this result helps to reduce training time and complexity. Furthermore, our models
were tested on three stock price datasets, Apple Inc. (AAPL), An Phat Bioplastic JSC
(AAA), and Bank of Foreign Trade of Vietnam (VCB), so they can give traders and
investors more useful information for making good decisions and earning more profit
in the Vietnamese stock market. In future work, we will add more crucial features
that affect the stock price, with the aim of achieving real-time stock market
forecasting in Vietnam.

Acknowledgements. This work was funded by Hanoi University of Mining and Geology under
grant number 65/QD-MDC.

References
1. Khashei, M., Hajirahimi, Z.: A comparative study of series ARIMA/MLP hybrid models for
stock price forecasting. Commun. Stat. – Simul. Comput. 48(9), 2625–2640 (2019).
https://doi.org/10.1080/03610918.2018.1458138
2. Kumar, M., Thenmozhi, M.: Forecasting stock index returns using ARIMA-SVM, ARIMA-ANN,
and ARIMA-random forest hybrid models. IJBAAF 5(3), 284 (2014).
https://doi.org/10.1504/IJBAAF.2014.064307
3. Wadi, S.A., Almasarweh, M., Alsaraireh, A.A.: Predicting closed price time series data using
ARIMA Model. MAS 12(11), 181 (2018). https://doi.org/10.5539/mas.v12n11p181
4. Hiransha, M., Gopalakrishnan, E.A., Menon, V.K., Soman, K.P.: NSE stock market prediction
using deep-learning models. Procedia Comput. Sci. 132, 1351–1362 (2018).
https://doi.org/10.1016/j.procs.2018.05.050
5. Olah, C.: Understanding LSTM networks (2015).
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
6. Di Persio, L., Honchar, O.: Analysis of recurrent neural networks for short-term energy load
forecasting. Thessaloniki, Greece, p. 190006 (2017). https://doi.org/10.1063/1.5012469
7. Pang, X., Zhou, Y., Wang, P., Lin, W., Chang, V.: An innovative neural network approach for
stock market prediction. J. Supercomput. 76(3), 2098–2118 (2018).
https://doi.org/10.1007/s11227-017-2228-y
8. Karsoliya, S.: Approximating number of hidden layer neurons in multiple hidden layer BPNN
architecture. Int. J. Eng. Trends Technol. 3(6), 4 (2012)
9. Hossain, M.A., Karim, R., Thulasiram, R., Bruce, N.D.B., Wang, Y.: Hybrid deep learning
model for stock price prediction. In: 2018 IEEE Symposium Series on Computational
Intelligence (SSCI), Bangalore, India, pp. 1837–1844 (2018).
https://doi.org/10.1109/SSCI.2018.8628641
10. Salleh, H., Atrick Vincent, A.M.P.: An investigation into the performance of the multilayer
perceptron architecture of deep learning in forecasting stock prices. UMT JUR, vol. 3, no. 2,
pp. 61–68 (2021). https://doi.org/10.46754/umtjur.2021.04.006

11. Hochreiter, S.: The vanishing gradient problem during learning recurrent neural nets and
problem solutions. Int. J. Unc. Fuzz. Knowl. Based Syst. 06(02), 107–116 (1998).
https://doi.org/10.1142/S0218488598000094
12. Heaton, J.B., Polson, N.G., Witte, J.H.: Deep learning for finance: deep portfolios. Appl.
Stochast. Models Bus. Ind. 33(1), 3–12 (2017). https://doi.org/10.1002/asmb.2209
13. Luong, M.-T., Sutskever, I., Le, Q.V., Vinyals, O., Zaremba, W.: Addressing the rare word
problem in neural machine translation (2015). http://arxiv.org/abs/1410.8206.
Accessed 1 Mar 2022
14. Aryal, S., Nadarajah, D., Rupasinghe, P.L., Jayawardena, C., Kasthurirathna, D.: Comparative
analysis of deep learning models for multi-step prediction of financial time series. J. Comput.
Sci. 16(10), 1401–1416 (2020). https://doi.org/10.3844/jcssp.2020.1401.1416
15. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780
(1997). https://doi.org/10.1162/neco.1997.9.8.1735
16. Brownlee, J.: Stacked long short-term memory networks (machinelearningmastery.com),
18 August 2017.
https://machinelearningmastery.com/stacked-long-short-term-memory-networks/
17. Pascanu, R., Gulcehre, C., Cho, K., Bengio, Y.: How to construct deep recurrent neural
networks (2014). http://arxiv.org/abs/1312.6026. Accessed 24 Feb 2022
18. Graves, A., Jaitly, N., Mohamed, A.: Hybrid speech recognition with deep bidirectional
LSTM. In: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding,
Olomouc, Czech Republic, pp. 273–278 (2013). https://doi.org/10.1109/ASRU.2013.6707742
19. Koenecke, A.: Applying deep neural networks to financial time series forecasting. Institute
for Computational & Mathematical Engineering, Stanford, California, USA (2020).
https://web.stanford.edu/~koenecke/files/Deep_Learning_for_Time_Series_Tutorial.pdf
20. Al Ridhawi, M.: Stock market prediction through sentiment analysis of social-media and
financial stock data using machine learning (2021). https://doi.org/10.20381/RUOR-27045
21. “Vietstock” Vietstock. https://vietstock.vn/
22. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization (2017).
http://arxiv.org/abs/1412.6980. Accessed 02 Mar 2022
23. Terna, P.: A Deep Learning Model to Forecast Financial Time-Series (2015)
