Algorithms: A Machine Learning View On Momentum and Reversal Trading
Article
A Machine Learning View on Momentum
and Reversal Trading
Zhixi Li and Vincent Tam *
Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road,
Hong Kong, China; [email protected]
* Correspondence: [email protected]; Tel.: +852-28592697
Received: 15 September 2018; Accepted: 24 October 2018; Published: 26 October 2018
Abstract: Momentum and reversal effects are important phenomena in stock markets. In academia,
relevant studies have been conducted for years. Researchers have attempted to analyze these
phenomena using statistical methods and to give some plausible explanations. However, those
explanations are sometimes unconvincing. Furthermore, it is very difficult to transfer the findings
of these studies to real-world investment trading strategies due to the lack of predictive ability.
This paper represents the first attempt to adopt machine learning techniques for investigating the
momentum and reversal effects occurring in any stock market. In the study, various machine learning
techniques, including the Decision Tree (DT), Support Vector Machine (SVM), Multilayer Perceptron
Neural Network (MLP), and Long Short-Term Memory Neural Network (LSTM) were explored and
compared carefully. Several models built on these machine learning approaches were used to predict
the momentum or reversal effect on the stock market of mainland China, thus allowing investors to
build corresponding trading strategies. The experimental results demonstrated that these machine
learning approaches, especially the SVM, are beneficial for capturing the relevant momentum and
reversal effects, and possibly building profitable trading strategies. Moreover, we propose the
corresponding trading strategies in terms of market states to acquire the best investment returns.
Keywords: stock market; machine learning; momentum effect; momentum trading; reversal effect;
reversal trading
1. Introduction
Momentum and reversal effects are common and interesting phenomena in stock markets.
The momentum effect means that stocks that have performed well in the past, i.e., given higher
returns (winners), will probably continue to outperform those that have performed poorly in the
past (losers). In contrast, the reversal effect means that past losers may
become winners in the future.
The reversal effect was first observed by [1], in which it was found that buying losers and selling
winners might acquire superior returns on the US stock market, because the US market easily overreacts
to some events, which results in abnormal price movements. The momentum effect, which claims that
buying winners and selling losers at the same time could earn significant positive returns over holding
periods of 3–12 months on the US stock market, was discovered by [2].
Up until recently, many relevant studies have been conducted. In addition to the US market,
researchers stated that stock markets in different regions have varying degrees of momentum and/or
reversal effect(s). For example, Reference [3] observed the momentum effect in the Latin American
emerging markets. Reference [4] found evidence of a substantial momentum effect in the China
Shanghai stock market over the period from 1995 to 2005. Reference [5] proposed a contrarian portfolio
strategy that could obtain profits on the Malaysian stock market based on the short-term reversal effect.
Reference [6] pointed out short-term reversal and mid-term momentum effects in weekly stock returns
in the European markets. Reference [7] presented profitable arbitrage strategies built on the short-term
reversal effect on the Hong Kong stock market.
On top of these observations, various studies [8–12] have been trying to explain the mechanisms
behind the effects. For instance, Reference [8] showed that the momentum effect may be correlated to
the past trading volume. Reference [9] concluded that the fundamental finance factors have important
links with the reversal effect for stocks traded on the Australian Stock Exchange. Reference [10] argued
that the market state has a strong relationship with the momentum effect on the Indian equity market.
In addition, some researchers have sought to explain the phenomena via behavioral finance models,
such as [11,12].
The existence of momentum and reversal effects has challenged the Efficient Markets Hypothesis
(EMH). In other words, investors may earn excess returns if they can predict which effect will occur
in the next market period. Unfortunately, the conclusions of most existing studies are
highly dependent on human experience and settings, e.g., specific market observation and
holding periods. Their findings tend to be unrepeatable in other periods. As a result, effects that
were observed in the past may disappear in the future. Similarly, the factors
proposed to explain the effects are not very robust: the links may not persist when applied to other
market periods. Thus, it is difficult to transfer these research outputs to real-world investment trading.
Nowadays, machine learning, as one of the most important approaches in artificial intelligence,
is a very active research topic in academia as well as in industry, and it
has been applied widely to diverse domains [13,14]. Machine learning is capable of
automatically recognizing potentially useful patterns in financial data [15].
The purpose of this paper is to propose the use of machine learning approaches instead of the
traditional statistical methods (e.g., the Causality Test and Hypothesis Test) that have been used in
previous studies to investigate the momentum and reversal effects on the stock market. To the best
of our knowledge, little research has applied machine learning to this problem. In this research, we
regard the problem as a supervised machine learning task. This paper presents several models built
on various popular machine learning approaches, including the Decision Tree (DT), Support Vector
Machine (SVM), Multilayer Perceptron Neural Network (MLP), and Long Short-Term Memory Neural
Network (LSTM), to learn from historical data and to predict the effects in the next period. Among
these approaches, DT learning constructs a decision tree that maps the observed features of each
example to a conclusion about its target value. It is one of the most widely used predictive modeling
approaches in data mining, machine learning, and statistics. The SVM is a supervised learning
model, with associated learning algorithms, that analyzes
the underlying data for classification or regression tasks. The conventional SVM approach has
been extensively applied in many real-life applications including financial forecasting, image or voice
recognition [16,17], etc. Furthermore, the MLP and LSTM are neural network models that are mostly
used for time series prediction in numerous real-world applications, while the convolutional neural
network (CNN) approach is most commonly used to analyze the complex relationships between pixels
for image or video processing. Essentially, CNN uses a variation of the MLP to carry out minimal
preprocessing for the input image or video files. Recently, other research studies have tried to adapt the
CNN models for financial forecasting. On top of this, Reference [18] proposed an improved bacterial
chemotaxis optimization (IBCO) technique for integration into the back propagation neural network
to develop a more efficient forecasting model for stock prediction. Obviously, a diverse range of
trading strategies involving different machine learning approaches can be developed and thoroughly
evaluated. However, due to the limited resources and time at hand, we specifically consider several
basic and commonly used models of the DT, SVM, MLP, and LSTM approaches for our preliminary
investigation in this manuscript. In addition, it is worth noting that the testing data sets employed in
this research study include the China Securities Index 300 (CSI 300) as a capitalization-weighted stock
Algorithms 2018, 11, 170 3 of 16
market index to reflect the overall performance of China’s top 300 and most liquid A-share stocks
traded on the Shanghai and Shenzhen stock exchanges. The CSI 300 was chosen because China's
stock market is fast-growing and has shown great volatility in the past.
In this paper, Section 2 presents the definition of the problem and proposed methods. Section 3
describes the experiment in detail. All the collected experimental results are thoroughly considered
and discussed in Sections 4 and 5. Finally, the concluding remarks are given in Section 6.
$$OR_i^T = \prod_{t=T+1}^{T+J} (r_{it} + 1) - 1 \qquad (1)$$
$$HR_i^T = \prod_{t=T+J+1}^{T+J+K} (r_{it} + 1) - 1 \qquad (2)$$
In (1) and (2), $OR_i^T$ is the total return of the $i$th stock in the observation period ($J$), while
$HR_i^T$ is the total return of the $i$th stock over the holding period ($K$). $r_{it}$ is the daily return
of the $i$th stock on the $t$th transaction day. $T$ represents the starting day of the observation period.
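As a minimal sketch of (1) and (2), the following Python functions compound daily returns over the observation and holding windows. The function names and the 0-based list layout (the return of day 1 sits at index 0, so day T+1 sits at index T) are illustrative assumptions, not part of the original study.

```python
from math import prod

def total_return(daily_returns, start, length):
    """Compound the daily returns of days start+1 .. start+length."""
    window = daily_returns[start:start + length]
    if len(window) < length:
        raise ValueError("not enough data for the requested window")
    return prod(1.0 + r for r in window) - 1.0

def observation_and_holding_returns(daily_returns, T, J, K):
    """Return (OR, HR) for one stock, following Equations (1) and (2)."""
    obs = total_return(daily_returns, T, J)        # days T+1 .. T+J
    hold = total_return(daily_returns, T + J, K)   # days T+J+1 .. T+J+K
    return obs, hold
```

For example, with daily returns of 10% and −5% in the observation window, the observation return is 1.10 × 0.95 − 1 = 4.5%.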
Stocks in a pre-defined asset pool may be ordered by their returns in the observation period.
The N stocks with the highest returns are regarded as winners, whilst the N stocks with the
lowest returns are marked as losers. For momentum trading, the winners in the observation period
(J) are selected to build a portfolio, which is then held until the end of the holding period (K).
Conversely, the losers are selected to build the portfolio for reversal trading.
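$$R_O^T = \frac{\sum_{i=1}^{N} \left[ \prod_{t=T+1}^{T+J} (r_{it} + 1) - 1 \right]}{N} = \frac{\sum_{i=1}^{N} OR_i^T}{N} \qquad (3)$$
$$R_H^T = \frac{\sum_{i=1}^{N} \left[ \prod_{t=T+J+1}^{T+J+K} (r_{it} + 1) - 1 \right]}{N} = \frac{\sum_{i=1}^{N} HR_i^T}{N} \qquad (4)$$
Thus, we may calculate the average returns of the portfolio (winners or losers) in J and K,
i.e., $R_O^T$ and $R_H^T$, respectively, according to (3) and (4).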
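The ranking step and the equal-weighted averages in (3) and (4) can be sketched as follows. The dictionary-based layout and the function names are hypothetical choices made for illustration, and ties are broken by the input order, which the text does not specify.

```python
def split_winners_losers(obs_returns, n):
    """Rank stocks by observation-period return; return (winners, losers).

    obs_returns maps a stock identifier to its observation return.
    """
    if n < 1 or 2 * n > len(obs_returns):
        raise ValueError("n must satisfy 1 <= n <= len(pool) / 2")
    ranked = sorted(obs_returns, key=obs_returns.get, reverse=True)
    return ranked[:n], ranked[-n:]

def average_return(returns, portfolio):
    """Equal-weighted average of per-stock returns, as in Equations (3) and (4)."""
    return sum(returns[s] for s in portfolio) / len(portfolio)
```

A momentum portfolio would then hold the winners and a reversal portfolio the losers, with `average_return` applied to the holding-period returns of the chosen group.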
In this research, we built prediction models using four proposed machine learning techniques
to predict the effect (momentum, reversal, or no effect) that may happen in the next holding period.
After that, corresponding strategies were generated based on these predicted signals.
As a result, DT can help to make financial decisions, i.e., betting on a momentum or reversal effect.
The C4.5 algorithm, an extension of ID3, was adopted in this research. Compared with ID3, the C4.5
algorithm can handle both continuous and discrete data features.
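To illustrate how a C4.5-style tree can handle a continuous feature, the sketch below scores binary threshold splits by gain ratio, the criterion that C4.5 uses to rank attributes. It is a simplified illustration rather than the implementation used in this research: taking candidate thresholds as midpoints between consecutive distinct values is an assumption, and the feature values and labels in the usage note are made up.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split value <= threshold vs. value > threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    w_left, w_right = len(left) / len(labels), len(right) / len(labels)
    info_gain = entropy(labels) - w_left * entropy(left) - w_right * entropy(right)
    split_info = -(w_left * log2(w_left) + w_right * log2(w_right))
    return info_gain / split_info

def best_threshold(values, labels):
    """Pick the midpoint threshold with the highest gain ratio (assumed scheme)."""
    distinct = sorted(set(values))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct feature values")
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))
```

With feature values `[0.01, 0.02, 0.08, 0.09]` and labels `["rev", "rev", "mom", "mom"]`, the chosen threshold lies between 0.02 and 0.08, where the split separates the two classes perfectly.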
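$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j)$$
$$\text{s.t.} \quad 0 \le \alpha_i \le C, \quad i \in [1, 2, \ldots, n], \qquad \sum_{i=1}^{n} \alpha_i y_i = 0. \qquad (5)$$
In (5), $k(x_i, x_j)$ is the kernel function, while $C$ is the penalty factor.
$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i^{*} y_i \, k(x_i, x) + \beta^{*} \right) \qquad (6)$$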
SVM can overcome overfitting problems [23]. Essentially, SVM uses the kernel function to
project the inputs into high-dimensional feature spaces so that SVM can efficiently solve non-linear
classification problems, as shown in (6). In this research, we selected the radial basis function (RBF) as
the kernel function, as described in (7).
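The decision rule in (6) with an RBF kernel can be sketched in a few lines of Python. This is an illustration under stated assumptions: the kernel is written in the common form exp(−γ‖x − z‖²), where the name `gamma` and this parameterization are assumptions that may differ from the form in (7); sgn(0) is taken as +1; and the support vectors, multipliers, and bias in the usage note are made-up values, not trained ones.

```python
from math import exp

def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return exp(-gamma * sq_dist)

def svm_decision(x, support_vectors, labels, alphas, bias, gamma):
    """Evaluate the sign of sum_i alpha_i * y_i * k(x_i, x) + bias, as in (6)."""
    score = bias + sum(
        a * y * rbf_kernel(sv, x, gamma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1  # assumption: sgn(0) is mapped to +1
```

With support vectors `(0, 0)` (label +1) and `(2, 0)` (label −1), unit multipliers, zero bias, and `gamma = 1`, a point near the first support vector is classified as +1 and a point near the second as −1.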
In addition, there are variants of the SVM approach being applied to a diverse range of application
domains. Examples include the fuzzy SVM (FSVM) [24–26] and the twin SVM (TWSVM) [27,28].
As numerous industrial applications may contain fuzzy or noisy data, the FSVM tackles the relevant
fuzzy information of the underlying applications. In [25], a novel approach combining the wavelet
contour analysis for backbone detection, wavelet packet entropy, and FSVM for spine classification was
successfully applied and carefully studied. Moreover, a novel advanced fuzzy SVM (NA-FSVM)
method was proposed in [26] and used to predict the trends of stock prices. On the other hand, the TWSVM
approach intrinsically determines two nonparallel hyperplanes such that each hyperplane is closest to
one of the two classes yet as far as possible from the other class. Essentially, the TWSVM targets two
smaller sized quadratic programming problems (QPPs) whereas the conventional SVM targets one
larger QPP. Thus, the TWSVM generally works faster than the conventional SVM approach.
this problem. Our MLP model is composed of five layers, i.e., an input layer, an output layer, and three
hidden dense layers.
Figure 3. Structure of the Long Short-Term Memory Neural Network (LSTM) unit from [30].
As with the MLP model, and after repeated tuning, our proposed LSTM model is
composed of a single input layer, followed by three LSTM layers and a dense output layer. Figure 4
illustrates the proposed topology of our LSTM model. The first layer is the input layer with the input
shape (5, 90), i.e., the lookback step is set to 5 after tuning. The second layer is an LSTM layer with
the ReLU activation function. The third and fourth layers are LSTM layers with the sigmoid
activation function. Each hidden LSTM layer has 32 neurons. The final layer is a dense
layer that outputs the classification result using the softmax function.
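To give a rough sense of the size of this topology, the sketch below counts trainable parameters. It assumes the standard LSTM formulation (four gates, each with input weights, recurrent weights, and one bias vector) and three output classes (momentum, reversal, or no effect); both are illustrative assumptions rather than reported details, and the function names are hypothetical. The lookback of 5 does not affect the count, because LSTM weights are shared across time steps.

```python
def lstm_params(input_dim, units):
    """Parameters of one LSTM layer: 4 gates x (input + recurrent weights + bias)."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """Parameters of a fully connected layer with bias."""
    return input_dim * units + units

def model_params(n_features=90, lstm_units=32, n_lstm_layers=3, n_classes=3):
    """Total parameters of stacked LSTM layers followed by a dense output layer."""
    total = 0
    in_dim = n_features
    for _ in range(n_lstm_layers):
        total += lstm_params(in_dim, lstm_units)
        in_dim = lstm_units  # each later layer consumes the previous layer's output
    return total + dense_params(in_dim, n_classes)
```

Under these assumptions the first LSTM layer has 15,744 parameters, each of the next two has 8,320, and the output layer has 99, for a total of 32,483.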
3. Experiment Setup
the Price-Earnings Ratio (PE), the Price-to-Book Ratio (PB), the Price-To-Sales Ratio (PS), the Price Cash
Flow Ratio (PCF), etc.
In addition to the CSI 300 index, we put the past winners and losers into the momentum
(MOM) and reversal (REV) groups, respectively. Then, the above indicators together with some
mathematical statistics, such as the mean and standard deviation values, were calculated to generate
corresponding features.
Feature Sets
amplitude market_cap mom_ps rev_amplitude rev_roc
amplitude_std market_cap_std mom_ps_std rev_amplitude_std rev_roc_std
cci mom_amplitude mom_roc rev_cir_cap rev_turnover
change mom_amplitude_std mom_roc_std rev_cir_cap_std rev_turnover_std
cir_cap mom_cir_cap mom_turnover rev_current_rtn rev_yield_dispersion
cir_cap_std mom_cir_cap_std mom_turnover_std rev_current_rtn_std rev_yield_dispersion_std
close mom_current_rtn mom_yield_dispersion rev_lb roc
current_rtn mom_current_rtn_std mom_yield_dispersion_std rev_lb_std roc_std
current_rtn_std mom_lb obv rev_market_cap rsi
ema mom_lb_std open rev_market_cap_std sar
high mom_market_cap pb rev_pb sma
hurst mom_market_cap_std pb_std rev_pb_std turnover
kdj_slow_d mom_pb pcf rev_pcf turnover_std
kdj_slow_k mom_pb_std pcf_std rev_pcf_std vol_change
lb mom_pcf pe rev_pe volume
lb_std mom_pcf_std pe_std rev_pe_std willr
low mom_pe ps rev_ps yield_dispersion
macd mom_pe_std ps_std rev_ps_std yield_dispersion_std
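The group-level entries in the feature set (for example `mom_pe` and `mom_pe_std`) can be read as the mean and standard deviation of an indicator across the stocks in a group. The sketch below shows one way to generate such a pair; the helper name, the dictionary layout, and the use of the population standard deviation are assumptions, and the values in the usage note are made up.

```python
from statistics import mean, pstdev

def group_features(indicator_by_stock, group, prefix, name):
    """Mean and standard deviation of one indicator over the stocks in a group.

    prefix is "mom" or "rev"; name is the indicator, e.g. "pe".
    """
    values = [indicator_by_stock[s] for s in group]
    return {
        f"{prefix}_{name}": mean(values),
        f"{prefix}_{name}_std": pstdev(values),  # assumption: population std
    }
```

For instance, `group_features({"A": 10.0, "B": 14.0, "C": 30.0}, ["A", "B"], "mom", "pe")` yields a `mom_pe` of 12.0 and a `mom_pe_std` of 2.0.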
3.4. Backtesting
In the experiment, we examined different observation and holding periods to investigate the
momentum and reversal effects.
First, prediction models were built based on the machine learning approaches proposed above.
Then, we ran backtests with the different models on the testing data, trying different
combinations of observation and holding periods for each model. Finally, the paper trading returns, as
indicated in (8), and the Sharpe ratios, as shown in (9), were calculated and compared carefully.
$$R_p = \frac{P_t}{P_0} - 1, \qquad (8)$$
where $P_0$ is the initial Net Asset Value (NAV) of the portfolio, whose value was set to 1.0 at the
beginning of the backtest, while $P_t$ is the NAV at the end of time $t$. In (9), $R_i^H$ is the daily
return on the $i$th day, while $R_f$ is the risk-free rate.
Since transaction costs may dramatically affect investment returns
in real-world trading, we took these costs into account in the backtesting for a more realistic
trading simulation. The transaction costs and risk-free rate are listed in Table 3.
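The backtest metrics can be sketched as follows. `cumulative_return` follows (8) directly. The Sharpe ratio shown here is a common annualized form (mean daily excess return over its sample standard deviation, scaled by the square root of 252 trading days) and may differ in detail from (9). The cost model, a single proportional rate charged on both the buy and the sell, and all parameter names are hypothetical; the actual rates are those in Table 3.

```python
from math import sqrt
from statistics import mean, stdev

def cumulative_return(nav):
    """Equation (8): final NAV divided by initial NAV, minus one."""
    return nav[-1] / nav[0] - 1.0

def sharpe_ratio(daily_returns, annual_risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from daily returns (assumed convention)."""
    daily_rf = annual_risk_free / periods_per_year
    excess = [r - daily_rf for r in daily_returns]
    return mean(excess) / stdev(excess) * sqrt(periods_per_year)

def net_of_costs(gross_return, cost_rate):
    """Net return of one round trip with cost_rate charged on buy and on sell."""
    return (1.0 + gross_return) * (1.0 - cost_rate) ** 2 - 1.0
```

For example, a gross holding-period return of 10% with a hypothetical 0.1% cost per side leaves a net return of about 9.78%.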
4. Results
In addition to the NAV curve of the buy-and-hold strategy, Figure 9 puts all curves together so
that we can make comparisons across the best strategies produced by the machine learning models
and standalone momentum and reversal trading.
Figure 9. Portfolio performance comparison for the best strategies and models.
5. Discussion
124.32% and a Sharpe ratio of 0.85 for the reversal trading. These strategies beat the benchmark strategy,
i.e., the buy-and-hold strategy, which had a return of 30.27% and a Sharpe ratio of 0.37.
However, the returns and Sharpe ratios for some periods were even worse than the benchmark.
This finding is consistent with many existing studies. For instance, reversal trading achieved a return
of 124.32% for (J = 5, K = 15), whereas the return of momentum trading was −16.00%.
The opposite holds in other cases, such as (J = 15, K = 20). This suggests that the
momentum and reversal effects are highly sensitive to the selection of the observation and
holding periods.
As for the best candidate in each model, the best results obtained with each
machine learning model were much better than the benchmark as well as the standalone momentum
and reversal strategies. For example, the highest return with DT, 207.17%, occurred for the case of
(J = 15, K = 10). Among all models, the SVM was the best, with an average return of
66.48%, a highest return of 239.43%, and a Sharpe ratio of 1.68 for the case of (J = 15, K = 10).
In fact, the measurements of the average and best performances are meaningful for real-world
trading. The investor can bet on the best strategy to acquire the highest potential return, and he or she
can allocate the capital to the strategies with different observation and holding periods to decrease
the risk.
These findings suggest we may adopt the LSTM in the fluctuating market, select any machine
learning model in the bull market, and change to the SVM to avoid a great loss when a market crash
is coming.
6. Conclusions
In summary, this research represents the first attempt to disclose and understand the momentum
and reversal effects in the stock market through machine learning techniques. We investigated
various machine learning approaches, built corresponding trading strategies, and conducted
relevant backtestings.
The experimental results verify that the reversal effect tends to occur in the CSI 300 stock market.
By comparing the backtesting results, it has been shown that machine learning approaches were
helpful for building more profitable trading strategies. The overall performance beat the benchmark as
well as the standalone momentum and reversal trading. Furthermore, we proposed corresponding
trading strategies in terms of market states, i.e., LSTM for the fluctuating market state and SVM for the
crashing market state.
Up until now, few studies have conducted this type of research. Our research provides a new
horizon for the study of momentum and reversal effects on the stock market. It could be beneficial for
individual investors building strategies to obtain excess returns from the market. In addition, it is very
applicable to algorithmic trading for institutional investors.
Clearly, there is much future work to be carried out. Firstly, macroeconomic indicators
and sentiment data extracted from online social networks could be taken into account as features.
Secondly, the volatility index (such as VIX for the US stock market or VHSI for the Hong Kong stock
market) is a powerful tool for measuring and even predicting the current and future volatilities of the
market. It would be pioneering work to combine the volatility indexes with the current work, and this
could help the model to analyze and predict the market states more accurately. Thirdly, the selection of
observation and holding periods could be investigated more carefully, ideally within an adaptive
framework built by the machine learning model. Furthermore, it would be interesting to
investigate how the different variants of SVM, such as the fuzzy or twin SVM, or convolutional
neural networks could be adapted for financial forecasting in future studies. Last but not least, it would
be worth creating more intelligent and comprehensive models or frameworks that ensemble various
machine learning models to accommodate complicated market scenarios.
Author Contributions: Z.L. designed the models, conducted the experiment, and drafted the manuscript;
V.T. supervised the research, provided comments, and revised and finalized the manuscript.
Funding: This research received no external funding.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. De Bondt, W.F.M.; Thaler, R. Does the stock market overreact? J. Financ. 1985, 40, 793–805. [CrossRef]
2. Jegadeesh, N.; Titman, S. Returns to buying winners and selling losers: Implications for stock market
efficiency. J. Financ. 1993, 48, 65–91. [CrossRef]
3. Muga, L.; Santamaría, R. The momentum effect in Latin American emerging markets. Emerg. Mark.
Financ. Trade 2007, 43, 24–45. [CrossRef]
4. Naughton, T.; Truong, C.; Veeraraghavan, M. Momentum strategies and stock returns: Chinese evidence.
Pac. Basin Financ. J. 2008, 16, 476–492. [CrossRef]
5. Hameed, A.; Ting, S. Trading volume and short-horizon contrarian profits: Evidence from the Malaysian
market. Pac. Basin Financ. J. 2000, 8, 67–84. [CrossRef]
6. Hühn, H.; Scholz, H. Reversal and momentum patterns in weekly stock returns: European evidence.
SSRN Electron. J. 2017. [CrossRef]
7. Tang, G.Y.N.; Zhang, H. Stock return reversal and continuance anomaly: New evidence from Hong Kong.
Appl. Econ. 2014, 46, 1335–1349. [CrossRef]
8. Connolly, R.; Stivers, C. Momentum and reversals in equity-index returns during periods of abnormal
turnover and return dispersion. J. Financ. 2003, 58, 1521–1556. [CrossRef]
9. Ramiah, V.; Li, D.L.; Carter, J.; Seetanah, B.; Thomas, S. Explaining Contrarian Profits with Finance Fundamentals.
2016. Available online: https://www.researchgate.net/profile/Vikash_Ramiah/publication/265821407_Explaining_Contrarian_Profits_with_Finance_Fundamentals/links/54e40d900cf2dbf60695661a/Explaining-Contrarian-Profits-with-Finance-Fundamentals.pdf (accessed on 25 October 2018).
10. Maheshwari, S.; Dhankar, R.S. Market state and investment strategies: Evidence from the Indian stock
market. IIM Kozhikode Soc. Manag. Rev. 2018, 7, 154–170. [CrossRef]
11. Makarov, I.; Rytchkov, O. Forecasting the forecasts of others: Implications for asset pricing. J. Econ. Theory
2012, 147, 941–966. [CrossRef]
12. Conrad, J.; Yavuz, M.D. Momentum and reversal: Does what goes up always come down? Rev. Financ. 2017,
21, 555–581. [CrossRef]
13. Nasrabadi, N.M. Pattern recognition and machine learning. J. Electron. Imaging 2007, 16, 049901.
14. Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach; Pearson Education Limited: London, UK, 2016.
15. Li, Z.; Tam, V.; Yeung, L. Combining cloud computing, machine learning and heuristic optimization
for investment opportunities forecasting. In Proceedings of the 2016 IEEE Congress on Evolutionary
Computation (CEC), Vancouver, BC, Canada, 24–29 July 2016; pp. 3469–3476.
16. Byun, H.; Lee, S.-W. Applications of Support Vector Machines for Pattern Recognition: A survey. In Pattern
Recognition with Support Vector Machines; Springer: Berlin/Heidelberg, Germany, 2002; pp. 213–236.
17. Kim, K.-J. Financial time series forecasting using support vector machines. Neurocomputing 2003, 55, 307–319.
[CrossRef]
18. Zhang, Y.; Wu, L. Stock market prediction of S&P 500 via combination of improved BCO approach and BP
neural network. Expert Syst. Appl. 2009, 36, 8849–8854.
19. Guenther, N.; Schonlau, M. Support vector machines. Stata J. 2016, 16, 917–937.
20. Zheng, B.; Yoon, S.W.; Lam, S.S. Breast cancer diagnosis based on feature extraction using a hybrid of
k-means and support vector machine algorithms. Expert Syst. Appl. 2014, 41, 1476–1482. [CrossRef]
21. Jung, H.C.; Kim, J.S.; Heo, H. Prediction of building energy consumption using an improved real coded
genetic algorithm based least squares support vector machine approach. Energy Build. 2015, 90, 76–84.
[CrossRef]
22. Li, Z.; Tam, V. A comparative study of a recurrent neural network and support vector machine for predicting
price movements of stocks of different volatilites. In Proceedings of the 2017 IEEE Symposium Series on
Computational Intelligence (SSCI), Honolulu, HI, USA, 27 November–1 December 2017; pp. 1–8.
23. Masnadi-Shirazi, H.; Vasconcelos, N. Risk minimization, probability elicitation, and cost-sensitive SVMs.
In Proceedings of the ICML, Haifa, Israel, 21–24 June 2010; pp. 759–766.
24. Lin, C.-F.; Wang, S.-D. Fuzzy support vector machines. IEEE Trans. Neural Netw. 2002, 13, 464–471. [PubMed]
25. Wang, S.; Chen, M.; Li, Y.; Zhang, Y.; Han, L.; Wu, J.; Du, S. Detection of dendritic spines using
wavelet-based conditional symmetric analysis and regularized morphological shared-weight neural
networks. Comput. Math. Methods Med. 2015, 454076. [CrossRef] [PubMed]
26. Wang, S.; Li, G.; Bao, Y. A novel improved fuzzy support vector machine based stock price trend forecast
model. arXiv, 2018; arXiv:1801.00681.
27. Ding, S.; Yu, J.; Qi, B.; Huang, H. An overview on twin support vector machines. Artif. Intell. Rev. 2014,
42, 245–252. [CrossRef]
28. Shao, Y.-H.; Zhang, C.-H.; Wang, X.-B.; Deng, N.-Y. Improvements on twin support vector machines.
IEEE Trans. Neural Netw. 2011, 22, 962–968. [CrossRef] [PubMed]
29. Faghfouri, A.E.; Frish, M.B. Robust discrimination of human footsteps using seismic signals. In Proceedings
of the Unattended Ground, Sea, and Air Sensor Technologies and Applications XIII, Orlando, FL, USA,
25–29 April 2011; International Society for Optics and Photonics: Bellingham, WA, USA, 2011; p. 80460D.
30. Tong, X.; Sun, S. Long Short-Term Memory Network for Wireless Channel Prediction; Springer: Singapore, 2018;
pp. 19–26.
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).