Comparison of Some Statistical Forecasting Techniques With GMDH Predictor: A Case Study
Abstract: Demand forecasts are extremely important for the manufacturing industry and are needed by all types of
businesses and their suppliers to distribute finished products to consumers on time. This study is concerned with
determining accurate models for forecasting cement demand. It presents results obtained with a self-organizing
model and compares them with those obtained by the usual statistical techniques. For this purpose, monthly sales
data of a typical cement product, ranging from January 2007 to February 2016, were collected. A nonlinear
modelling technique based on the Group Method of Data Handling (GMDH) is used to derive forecasts. Forecasts
were also made with various time series smoothing techniques: exponential smoothing, double exponential
smoothing, moving average, weighted moving average, and regression. The actual data were compared with the
forecasts generated by the time series models and the GMDH model. The mean absolute deviation (MAD), mean
absolute percentage error (MAPE), and mean square error (MSE) were calculated to compare forecasting accuracy.
The comparison shows that the GMDH model performs better than the statistical models in terms of MAD, MAPE,
and MSE.
and pattern recognition of complex systems. There are processes whose future must be known or whose
inter-relations must be analyzed.

The purpose of the study is to determine an accurate model for demand forecasting of cement. For this,
secondary sales data of cement have been collected and forecasts have been made by applying different time
series techniques. The GMDH method has also been used to derive forecasts. The mean absolute percentage error
(MAPE) and mean square error (MSE) have been calculated to compare the forecasting accuracy of the different
techniques.

exponential smoothing. One method, sometimes referred to as "Holt-Winters double exponential smoothing", is
followed here. One of its two smoothing factors is α, the data smoothing factor, with 0 < α < 1; the other is β,
the trend smoothing factor, with 0 < β < 1. We also used the GMDH predictor, version GMDH Data Science 3.5.9,
to derive the forecast. Out of 110 monthly observations, 58 months are used for the training set and the rest
are used for evaluation in the checking set.

Equation (1), the Kolmogorov-Gabor polynomial:

$$Y(x_1, \ldots, x_m) = a_0 + \sum_{i=1}^{m} a_i x_i + \sum_{i=1}^{m}\sum_{j=1}^{m} a_{ij} x_i x_j + \sum_{i=1}^{m}\sum_{j=1}^{m}\sum_{k=1}^{m} a_{ijk} x_i x_j x_k + \cdots \qquad (1)$$

Regressor matrix of the quadratic form in two inputs:

$$X = \begin{bmatrix}
1 & x_{1p} & x_{1q} & x_{1p}x_{1q} & x_{1p}^{2} & x_{1q}^{2} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & x_{Mp} & x_{Mq} & x_{Mp}x_{Mq} & x_{Mp}^{2} & x_{Mq}^{2}
\end{bmatrix}$$

where p and q denote the two input variables entering the quadratic form and M is the number of observations
in the training set.
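As a minimal sketch of how a regressor matrix of this form can be used, the Python code below fits the six-term quadratic polynomial in two inputs by ordinary least squares on a training set and ranks every input pair by its MSE on a checking set. The function names (`quadratic_design`, `rank_input_pairs`), the use of NumPy's `lstsq`, and the synthetic data are assumptions made for illustration; this is not the procedure implemented in the GMDH software used in the study.

```python
import itertools

import numpy as np


def quadratic_design(u, v):
    """Build rows [1, u, v, u*v, u^2, v^2] for two input columns u and v."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return np.column_stack([np.ones_like(u), u, v, u * v, u ** 2, v ** 2])


def rank_input_pairs(x_train, y_train, x_check, y_check):
    """Fit one quadratic model per input pair and sort by checking-set MSE."""
    results = []
    n_inputs = x_train.shape[1]
    for p, q in itertools.combinations(range(n_inputs), 2):
        # Least-squares coefficients estimated on the training rows only.
        design = quadratic_design(x_train[:, p], x_train[:, q])
        coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
        # Score the fitted polynomial on rows it has not seen.
        pred = quadratic_design(x_check[:, p], x_check[:, q]) @ coef
        mse = float(np.mean((y_check - pred) ** 2))
        results.append({"pair": (p, q), "mse": mse, "coef": coef})
    return sorted(results, key=lambda r: r["mse"])


# Synthetic illustration (not the cement data): y depends on inputs 0 and 1 only.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(110, 3))
y = 1.0 + 2.0 * x[:, 0] - 3.0 * x[:, 1] + 0.5 * x[:, 0] * x[:, 1]

# Same split sizes as in the text: 58 training rows, the remaining 52 for checking.
ranked = rank_input_pairs(x[:58], y[:58], x[58:], y[58:])
best = ranked[0]
print(best["pair"], best["mse"])
```

Scoring on the checking rows rather than the training rows is what lets the ranking penalise pairs that only fit the training data well.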
The GMDH method was originally formulated to solve for higher-order regression polynomials, especially in
modelling and classification problems. The general connection between input and output variables can be
expressed by a complicated polynomial series in the form of the Volterra series, known as the
Kolmogorov-Gabor polynomial and given in equation (1).

2.1 Measurement of Forecasting Error
In order to evaluate the forecasting accuracy of the different techniques, several error measures were
calculated as loss functions.

Mean Absolute Deviation
A common method for measuring overall forecast error is the mean absolute deviation (MAD). Heizer and Render
(2001) noted that this value is computed by dividing the sum of the absolute values of the individual forecast
errors by the sample size.
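A minimal Python sketch of this calculation, together with the MSE and MAPE measures that the study also reports, is given below. The function names and the four-point example series are illustrative assumptions, not data from the study.

```python
def mad(actual, forecast):
    """Mean absolute deviation: average of |actual - forecast|."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean square error: average of (actual - forecast)^2."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    ratios = (abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * sum(ratios) / len(actual)


# Hypothetical demand figures, chosen only to make the arithmetic easy to follow.
actual = [100.0, 200.0, 400.0, 500.0]
forecast = [110.0, 190.0, 380.0, 525.0]

print(mad(actual, forecast), mse(actual, forecast), mape(actual, forecast))
```

MAD and MSE are expressed in the units of the data (and its square), while MAPE is scale-free, which is why the three measures can rank methods differently.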
Mean Squared Error (MSE)
In statistics, the mean squared error (MSE) of an estimator measures the average of the squares of the
"errors", that is, the difference between the estimator and what is estimated. MSE is a risk function,
corresponding to the expected value of the squared error loss or quadratic loss. The difference occurs because
of randomness or because the estimator does not account for information that could produce a more accurate
estimate.

Mean Absolute Percentage Error (MAPE)

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\frac{\left|\mathrm{Actual}_t - \mathrm{Forecast}_t\right|}{\mathrm{Actual}_t} \times 100\%$$

DATA COLLECTION AND ANALYSIS
Data collection is a significant aspect of any type of research study. The data used in this case study are
monthly sales data of cement. The data span the period from January 2007 to February 2016. The time series
plot is given in Fig. 1.

[Figure 1: time series plot of the monthly data; vertical axis: Sales Volume, 16000 to 28000]

the complete characteristic of the nonlinear system.
Step 2: Select $\binom{m}{2} = m(m-1)/2$ new input variables according to all possibilities of connection by
each pair of inputs in the layer. Construct the regression polynomial for this layer by forming the quadratic
expression which approximates the output y in equation (1).
Step 3: Identify the single best input variable out of these $\binom{m}{2}$ input variables according to the
value of the mean square error (MSE). The input variables that give the best results in the first layer are
allowed to form the second-layer candidate models of equation (1).
Set the new input $(x_1, x_2, x_3, x_4, \ldots, x_M)$ and
carried out as long as the MSE for the test data set decreases compared with the value obtained at the
previous one, as shown in Fig. 2. After the best models of each layer have been selected, the output model is
selected by the MSE: the model with the minimum value of the MSE is selected as the output model [15].

Figure 2. Stopping criteria of GMDH algorithm [plot of MSE against iterations]

Table 2. MAPE of different forecasting methods
Method                               MAPE
3-month Moving Average               11%
6-month Moving Average               14%
12-month Moving Average              11%
Weighted Moving Average              10%
Regression                           12%
GMDH Method                          4%
Exponential α = 0.3                  11%

Table 3. MSE of different forecasting methods
Method                               MSE
3-month Moving Average               7994519
6-month Moving Average               10355301
12-month Moving Average              7710194
Weighted Moving Average              6291543
Regression                           9177720
GMDH Method                          824882
Exponential α = 0.3                  7619269
Exponential α = 0.5                  6220179
Double Exponential α = 0.3, β = 0.5  11913465

From Table 3 it is clear that the model with the minimum value of the MSE is the GMDH model.

3.2 Analysis by Statistical Method
Various time series smoothing techniques such as exponential smoothing, double exponential smoothing, moving
average, and regression were used for forecasting the cement demand. Absolute deviations were also calculated.
The mean absolute deviations (MADs) found from these calculations are listed in Table 1.
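The smoothing techniques named in Section 3.2 can be sketched in Python as one-step-ahead forecasts. The function names, the initialisation choices (first observation as the starting exponential forecast; second observation and first difference as the starting level and trend of the double exponential method), the example weights, and the short series are assumptions for illustration. The study does not state how its forecasts were initialised or which weights it used, and the double exponential function follows Holt's linear-trend form with the two factors α and β.

```python
def moving_average(series, window):
    """Forecast for period t is the mean of the previous `window` observations."""
    return [sum(series[t - window:t]) / window for t in range(window, len(series))]


def weighted_moving_average(series, weights):
    """Like moving_average, with weights ordered from oldest to newest."""
    k = len(weights)
    total = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, series[t - k:t])) / total
        for t in range(k, len(series))
    ]


def exponential_smoothing(series, alpha):
    """F[t+1] = alpha * A[t] + (1 - alpha) * F[t], started with F[0] = A[0]."""
    forecasts = [series[0]]
    for actual in series[:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


def holt_double_exponential(series, alpha, beta):
    """Level-plus-trend forecasts for periods 2 .. len(series) - 1."""
    level = series[1]
    trend = series[1] - series[0]
    forecasts = []
    for actual in series[2:]:
        forecasts.append(level + trend)  # forecast made before seeing `actual`
        previous_level = level
        level = alpha * actual + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return forecasts


# Hypothetical monthly demand, not the cement series.
demand = [20000, 21000, 19500, 22000, 23000, 22500, 24000]
print(moving_average(demand, 3))
print(weighted_moving_average(demand, [1, 2, 3]))
print(exponential_smoothing(demand, 0.3))
print(holt_double_exponential(demand, 0.3, 0.5))
```

Each function returns forecasts only for periods where enough history exists, so the outputs have different lengths and must be aligned with the matching slice of the actual series before computing error measures.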
Table 1. MAD of different forecasting methods
Method                               MAD
3-month Moving Average               2306
6-month Moving Average               2791
12-month Moving Average              2230
Weighted Moving Average              2056
Regression                           2459
GMDH Method                          704
Exponential α = 0.3                  2286
Exponential α = 0.5                  2053
Double Exponential α = 0.3, β = 0.5  2861

From Table 1 it is seen that the MAD of the GMDH forecast is 704, whereas every statistical method gives a
four-digit MAD. The mean absolute percentage error (MAPE) and mean square error (MSE) were also calculated and
are reported in Table 2 and Table 3 respectively. The GMDH forecast has a MAPE of only 4%; the nearest value
is 9%, obtained by exponential smoothing (α = 0.5).

RESULTS AND DISCUSSION
The data analysis produced some informative results. The calculated mean absolute deviations (MADs) of the
forecasts from the different techniques are plotted in Fig. 3. The GMDH algorithm gives the lowest MAD and is
therefore the best-suited method.
The mean absolute percentage error (MAPE) and mean square error (MSE) are plotted in Fig. 4 and Fig. 5
respectively. The comparison of modelling results shows that the GMDH model performs better than the other
models in terms of MAPE and MSE.
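The size of the gap can be read directly from Tables 1 and 3. The short Python sketch below copies the reported MAD and MSE values and computes how far the GMDH figures lie below the best statistical method; the dictionary layout and the helper name `gap_to_best_statistical` are illustrative assumptions.

```python
# Values copied from Table 1 (MAD) and Table 3 (MSE).
mad_table = {
    "3-month moving average": 2306,
    "6-month moving average": 2791,
    "12-month moving average": 2230,
    "Weighted moving average": 2056,
    "Regression": 2459,
    "GMDH": 704,
    "Exponential smoothing (alpha=0.3)": 2286,
    "Exponential smoothing (alpha=0.5)": 2053,
    "Double exponential (alpha=0.3, beta=0.5)": 2861,
}
mse_table = {
    "3-month moving average": 7994519,
    "6-month moving average": 10355301,
    "12-month moving average": 7710194,
    "Weighted moving average": 6291543,
    "Regression": 9177720,
    "GMDH": 824882,
    "Exponential smoothing (alpha=0.3)": 7619269,
    "Exponential smoothing (alpha=0.5)": 6220179,
    "Double exponential (alpha=0.3, beta=0.5)": 11913465,
}


def gap_to_best_statistical(table, reference="GMDH"):
    """Return the best non-reference method, its error, and the relative reduction."""
    others = {name: value for name, value in table.items() if name != reference}
    best_name = min(others, key=others.get)
    reduction = 1.0 - table[reference] / others[best_name]
    return best_name, others[best_name], reduction


for label, table in (("MAD", mad_table), ("MSE", mse_table)):
    name, value, reduction = gap_to_best_statistical(table)
    print(f"{label}: best statistical method is {name} ({value}); "
          f"GMDH is {reduction:.1%} lower")
```

On both measures the closest statistical method is exponential smoothing with α = 0.5, which is consistent with the MAPE comparison given in the text.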
[Figure 3: MAD of the different forecasting methods; vertical axis: MAD, 0 to 3000]

Figure 4. Comparison of MAPE of different techniques [vertical axis: MAPE; horizontal axis: Forecasting methods]

Figure 5. Comparison of MSE of different techniques [vertical axis: MSE; horizontal axis: Forecasting methods]

To assess the performance of GMDH modelling, the demand for the last 52 months was forecasted and compared
with the test set. The results of the model, along with its forecasting precision, are shown in Table 4. The
normalized mean absolute error is found to be 4.65%, whereas the normalized RMSE is 6%. The fitting accuracy
of the GMDH model is also very good, as the value of R² is 0.90.

Table 4. Summary results of GMDH modelling
Output / Metrics                            Value
Post-processed result                       Model fit
Number of observations                      52
Normalized mean absolute error (NMAE)       4.65%
Normalized root mean square error (NRMSE)   6%
Standard deviation of residuals             5.8%
Coefficient of determination (R²)           0.90
Correlation coefficient                     0.95

time series.

CONCLUSION
This paper examined the forecasting accuracy of different statistical techniques as well as the GMDH
predictor. For that purpose, ten years of secondary sales data of a cement were collected. There was low
seasonal variation in the sales. Demand forecasting was performed using extrapolative time series methods,
such as exponential smoothing with level, trend, and seasonal components. In addition, moving average,
weighted moving average, and regression methods were used for forecasting the demand. A nonlinear
self-organizing model based on the Group Method of Data Handling (GMDH) was also applied to derive forecasts,
using the GMDH predictor version GMDH Data Science 3.5.9.
In order to evaluate the accuracy of prediction, the performance measures MAD, MAPE, and MSE were calculated.
No statistical method comes close to the GMDH predictor: the GMDH algorithm forecasts with an error of only
0.0367, or about 4%, which is substantially more accurate than the statistical methods.

References
1. Samsudin, R., Saad, P., and Shabri, A., 2010, "Hybridizing GMDH and Least squares SVM support vector
machine for forecasting tourism demand," IJRRAS, Vol. 3(3), pp. 274-279.
2. Ivakhnenko, A. G., and Ivakhnenko, G. A., 1995, "A Review of Problems Solved by Algorithms of the GMDH,"
Pattern Recognition and Image Analysis, Vol. 5(4), pp. 527-535.
3. Onwubolu, G. C., Buryan, P., and Lemke, F., 2008, "Modeling Tool Wear in End-Milling Using Enhanced GMDH
Learning Networks," International Journal of Advanced Manufacturing Technology, Vol. 39(11), pp. 1080-1092.
4. Kondo, T., Pandya, A. S., and Nagashino, A. S., 2007, "GMDH-Type Neural Network Algorithm with a Feedback
Loop for Structural Identification of RBF Neural Network," International Journal of Knowledge-Based and
Intelligent Engineering Systems, Vol. 11, pp. 157-168.
5. Puig, V., Witczak, M., Nejjari, F., Quevedo, J., and Korbicz, J., 2007, "A GMDH Neural Network-Based
Approach to Passive Robust Fault Detection Using a Constraint Satisfaction Backward Test," Engineering
Applications of Artificial Intelligence, Vol. 20, pp. 886-897.
6. Li, F., Upadhyaya, B. R., and Coffey, L. A., 2009, "Model-Based Monitoring and Fault Diagnosis of Fossil
Power Plant Process Units Using Group Method of Data Handling," ISA Transactions, Vol. 2, pp. 213-219.
7. Kondo, T., 2007, "Nonlinear Pattern Identification by Multi-Layered GMDH-Type Neural Network Self-Selecting
Optimum Neural Network Architecture," International Conference on Neural Information Processing, November,
Springer Berlin Heidelberg, pp. 882-891.
8. Chang, F. J., and Hwang, Y. Y., 1999, "A Self-Organization Algorithm for Real-Time Flood Forecast,"
Hydrological Processes, Vol. 13(2), pp. 123-138.
9. Ivakhnenko, A. G., 1970, "Heuristic Self-Organization in Problems of Engineering Cybernetics," Automatica,
Vol. 6(3), pp. 207-219.
10. Farlow, S. J., 1981, "The GMDH Algorithm of Ivakhnenko," The American Statistician, Vol. 35(4),
pp. 210-215.
11. Muller, J. A., and Lemke, F., 2000, "Self-Organizing Data Mining," Libri Books, Dresden, Berlin,
pp. 67-110.
12. Ezennaya, O. S., Isaac, O. E., Okolie, U. O., and Ezeanyim, O. I. C., 2014, "Analysis of Nigeria's
National Electricity Demand Forecast (2013-2030)," International Journal of Scientific & Technology Research,
Vol. 3, pp. 333-340.
13. Gooijer, J. G. D., and Hyndman, R. J., 2006, "25 Years of Time Series Forecasting," International Journal
of Forecasting, Vol. 22, pp. 443-473.
14. Samsudin, R., Saad, P., and Skudai, 2009, "Combination of Forecasting Using Modified GMDH and Genetic
Algorithm," International Journal of Computer Information Systems and Industrial Management Applications,
Vol. 1, pp. 170-176.
15. Kondo, T., and Ueno, J., 2006, "Revised GMDH-type Neural Network Algorithm with a Feedback Loop
Identifying Sigmoid Function Neural Network," International Journal of Innovative Computing, Information and
Control, Vol. 2(5), pp. 985-996.
16. Sahu, P. K., and Kumar, R., 2013, "Demand Forecasting for Sales of Milk Product (Paneer) in Chhattisgarh,"
International Journal of Inventive Engineering and Sciences, Vol. 1(9), pp. 10-13.
17. Allen, P. G., 1994, "Economic Forecasting in Agriculture," International Journal of Forecasting, Vol. 10,
pp. 81-135.