
Comparison of Some Statistical Forecasting Techniques With GMDH Predictor: A Case Study

This document compares statistical forecasting techniques and the GMDH predictor for forecasting cement demand. Time series data on monthly cement sales from January 2007 to February 2016 was collected. Forecasts were generated using exponential smoothing, double exponential smoothing, moving average, weighted moving average, regression, and the GMDH model. The GMDH model is a self-organizing data modeling technique that uses polynomial transfer functions to model nonlinear relationships. Accuracy was evaluated using mean absolute deviation, mean absolute percentage error, and mean squared error. The results showed that the GMDH model produced more accurate forecasts compared to the other statistical techniques based on these error measures.


COMPARISON OF SOME STATISTICAL FORECASTING TECHNIQUES WITH GMDH PREDICTOR: A CASE STUDY
Syed Misbah Uddin*, Aminur Rahman, Emtiaz Uddin Ansari
Industrial and Production Engineering Department,
ShahJalal University of Science and Technology, Sylhet, Bangladesh.
*Corresponding e-mail:[email protected].

Abstract: Demand forecasts are extremely important for manufacturing industry and are also needed by all
types of businesses and their suppliers to distribute finished products to the consumer on time. This study is
concerned with determining accurate models for forecasting cement demand. To this end, the paper presents
results obtained with a self-organizing model and compares them with those obtained by the usual statistical
techniques. For this purpose, monthly sales data for a typical cement, ranging from January 2007 to
February 2016, were collected. A nonlinear modelling technique based on the Group Method of Data Handling
(GMDH) is used to derive forecasts. Forecasts were also made using various time series techniques:
exponential smoothing, double exponential smoothing, moving average, weighted moving average and
regression. The actual data were compared with the forecasts generated by the time series models and the
GMDH model. The mean absolute deviation (MAD), mean absolute percentage error (MAPE) and mean
square error (MSE) were calculated to compare forecasting accuracy. The comparison shows that the GMDH
model performs better than the other statistical models in terms of MAD, MAPE and MSE.

Keywords: Forecast, GMDH algorithm, Time series, MAPE, MSE.

INTRODUCTION

In the past few decades, the cement industry has emerged as the fastest growing sector in Bangladesh due to massive construction work in the private and public sectors. Real estate has become an important source of economic activity, employment, tax revenue and income. Therefore, every company needs to understand its market demand in order to formulate responsive policies on cement production and marketing. More accurate demand forecasts would assist managerial, operational and tactical decision making. The selection of the forecasting model is therefore an important criterion that influences forecasting accuracy [1].

The GMDH algorithm has been successfully used to deal with uncertainty and with linear or nonlinear systems in a wide range of disciplines such as economy, ecology, medical diagnostics, signal processing, fossil power plant processes, the electric power industry and control systems [2-6]. Revised GMDH algorithms [7,8] have been introduced to model dynamic systems in flood forecasting and petroleum resource prediction with some success.

The Group Method of Data Handling (GMDH) algorithm is a multivariate analysis method for modeling and identifying uncertainty in linear or nonlinear systems. The algorithm was first introduced in 1967 by A.G. Ivakhnenko [9]. From the very beginning the approach was computer-based, so a set of computer programs and algorithms were the primary practical results achieved on the basis of the new theoretical principles. The method was quickly adopted in a large number of scientific laboratories worldwide thanks to open code sharing. At that time code sharing was quite a physical action, since the internet is at least 5 years younger than GMDH. Despite this, the first investigation of GMDH outside the Soviet Union was made soon afterwards by R. Shankar in 1972. Later, different GMDH variants were published by Japanese and Polish scientists.

The main idea of GMDH is the use of feed-forward networks based on short-term polynomial transfer functions whose coefficients are obtained using regression combined with emulation of the self-organizing activity behind neural network (NN) structural learning [10]. To improve the performance of the GMDH algorithm, Barron (1988) gave a comprehensive overview of some early developments of such networks and introduced the polynomial network training algorithm (PNETTR). Elder (1996) proposed the Algorithm for Synthesis of Polynomial Networks (ASPN) to improve the GMDH algorithm.

J.A. Muller and Frank Lemke developed and improved self-organizing data mining algorithms on the basis of the above results in the 1990s [11]. Further enhancements of the GMDH algorithm have been realized in the "Knowledge Miner" software. The GMDH algorithm has gradually become an effective tool for modeling, forecasting, and decision support

Journal of Mechanical Engineering, Vol. ME 47, December 2017


Transaction of the Mechanical Engineering Division, The Institution of Engineers, Bangladesh
and pattern recognition of complex systems. There are processes whose future must be known or whose inter-relations must be analyzed.

The purpose of the study is to determine an accurate model for forecasting cement demand. For this, secondary sales data for cement were collected and forecasts were made by applying different time series techniques. The GMDH method was also used to derive forecasts. The mean absolute percentage error (MAPE) and mean square error (MSE) were calculated to compare forecasting accuracy among the different techniques.

METHODOLOGY

This is a case study based on time series data from the cement industry. The data used are monthly cement sales covering the period from January 2007 to February 2016, a time series of 110 months. The data were analyzed using various time series models: moving average, weighted moving average, single exponential smoothing, double exponential smoothing and least-squares simple linear regression.

In this study, we use α = 0.3 and α = 0.5 for the single exponential smoothing method. Simple exponential smoothing does not do well when there is a trend in the data. For such situations, several methods have been devised under the name "double exponential smoothing" or "second-order exponential smoothing". The basic idea behind double exponential smoothing is to introduce a term that takes into account the possibility of the series exhibiting some form of trend. This slope component is itself updated via exponential smoothing. One method, sometimes referred to as "Holt-Winters double exponential smoothing", is followed here. It uses two smoothing factors: α, the data smoothing factor, with 0 < α < 1, and β, the trend smoothing factor, with 0 < β < 1. We also used the GMDH predictor, version GMDH Data Science 3.5.9, to derive the forecast. Of the 110 data points, 58 months are used as the training set and the rest are used for evaluation as the checking set.

The GMDH method was originally formulated to solve for higher-order regression polynomials, especially for modelling and classification problems. The general connection between input and output variables can be expressed by a complicated polynomial series in the form of the Volterra series, known as the Kolmogorov-Gabor polynomial:

$$Y(x_1, \dots, x_n) = a_0 + \sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j + \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ijk} x_i x_j x_k + \dots$$

Here x represents the input to the system, n is the number of inputs and a are the coefficients or weights. However, for most applications a quadratic form in only two variables, called a partial description (PD), is used to predict the output:

$$y_{GMDH} = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2 \qquad (1)$$

To obtain the values of the coefficients a for each of the m models, a system of Gauss normal equations is solved. The coefficients of the nodes in each layer are expressed in the form

$$A = (X^T X)^{-1} X^T Y$$

where

$$Y = (Y_1, Y_2, \dots, Y_M)^T, \qquad A = [a_0, a_1, a_2, a_3, a_4, a_5]^T$$

$$X = \begin{bmatrix} 1 & x_{1p} & x_{1q} & x_{1p} x_{1q} & x_{1p}^2 & x_{1q}^2 \\ 1 & x_{2p} & x_{2q} & x_{2p} x_{2q} & x_{2p}^2 & x_{2q}^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_{Mp} & x_{Mq} & x_{Mp} x_{Mq} & x_{Mp}^2 & x_{Mq}^2 \end{bmatrix}$$

with p and q indexing the two inputs of the pair, and M is the number of observations in the training set.

2.1 Measurement of Forecasting Error

In order to evaluate the forecasting accuracy of the different techniques, various central tendency measures were calculated as loss functions.

Mean Absolute Deviation

A common method for measuring overall forecast error is the mean absolute deviation. Heizer and Render (2001) noted that this value is computed by dividing the sum of the absolute values of the individual forecast errors by the sample size (the number of forecast periods). The equation is:

$$\mathrm{MAD} = \frac{1}{n} \sum_{t=1}^{n} \left| \mathrm{Actual}_t - \mathrm{Forecast}_t \right|$$

where n = the number of periods [12].
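As an illustration, MAD can be computed alongside the MSE and MAPE measures that this section goes on to define. The following is a minimal Python sketch, assuming paired lists of actual and forecast values for the same periods. The function name `forecast_errors` and the sample numbers are made up for illustration; they are not the cement sales data or the authors' code.

```python
def forecast_errors(actual, forecast):
    """Return (MAD, MSE, MAPE in percent) for paired actual and forecast values."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equally long")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    # MAD: mean of the absolute errors.
    mad = sum(abs(e) for e in errors) / n
    # MSE: mean of the squared errors.
    mse = sum(e * e for e in errors) / n
    # MAPE: mean absolute error relative to the actual value, in percent.
    # It is undefined when any actual value is zero (division by zero).
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return mad, mse, mape


# Made-up illustrative numbers (assumption), not data from the study.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
mad, mse, mape = forecast_errors(actual, forecast)
print(f"MAD={mad:.2f}  MSE={mse:.2f}  MAPE={mape:.2f}%")
```

Because MSE squares each error, it penalizes a few large misses more heavily than MAD does, while MAPE expresses the error relative to the size of the actual value.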


Mean Squared Error (MSE)

In statistics, the mean squared error (MSE) of an estimator measures the average of the squares of the "errors", that is, the differences between the estimator and what is estimated. MSE is a risk function, corresponding to the expected value of the squared error loss or quadratic loss. The difference occurs because of randomness or because the estimator does not account for information that could produce a more accurate estimate.

The MSE is the second moment (about the origin) of the error, and thus incorporates both the variance of the estimator and its bias. For an unbiased estimator, the MSE is the variance of the estimator. Like the variance, MSE has the same units of measurement as the square of the quantity being estimated. Jarrett (1991) stated that the mean square error (MSE) is a generally accepted technique for evaluating exponential smoothing and other methods.

The equation is:

$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left( \mathrm{Actual}_t - \mathrm{Forecast}_t \right)^2$$

where n = the number of periods [11].

Mean Absolute Percentage Error (MAPE)

Mean absolute percentage error (MAPE) is the most common measure of forecast error. MAPE functions best when there are no extremes in the data (including zeros). With zero or near-zero actual values, MAPE can give a distorted picture of error: the error on a near-zero item can be infinitely high, distorting the overall error rate when it is averaged in. For forecasts of items that are at or near zero volume, the symmetric mean absolute percentage error (SMAPE) is a better measure [9].

MAPE is the average over all time periods of the absolute percentage error, that is, the absolute value of actual minus forecast divided by actual:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{\left| \mathrm{Actual}_t - \mathrm{Forecast}_t \right|}{\mathrm{Actual}_t} \times 100\%$$

DATA COLLECTION AND ANALYSIS

Data collection is a significant aspect of any type of research study. The data used in this case study are monthly sales data of cement. The data span the period from January 2007 to February 2016. The time series plot is given in Fig. 1.

Figure 1: Monthly sales data (Jan 2007 to Feb 2016). Axes: Month (x) against Sales Volume (y).

After the sales data were collected, the GMDH algorithm and various statistical forecasting techniques were used to forecast demand. The mean absolute deviation (MAD), mean absolute percentage error (MAPE) and mean square error (MSE) were calculated to assess the forecasting performance of the different models.

3.1 Analysis by GMDH algorithm

The GMDH algorithm consists of a set of steps, described below:

Step 1: First, N observations of regression-type data are taken. The collected data are normalized with respect to their individual base values in order to restrict the variation of the data to the same level. The normalized data are denoted by $(x_1, x_2, x_3, x_4, \dots, x_M)$, where M is the total number of inputs. The original data are separated into training and test sets [14]. In this study the 110 data points were separated into training (58) and test (52) sets. The 58 training points are used to estimate the partial descriptions, which describe partial characteristics of the nonlinear system. The 52 test points are used to organize the complete description, which describes the complete characteristics of the nonlinear system.

Step 2: Select $\binom{m}{2} = m(m-1)/2$ new input variables according to all possible connections between each pair of inputs in the layer. Construct the regression polynomial for this layer by forming the quadratic expression that approximates the output y in equation (1).

Step 3: Identify the single best input variable out of these $\binom{m}{2}$ input variables according to the value of the mean square error (MSE). The input variables that give the best results in the first layer are allowed to form the second-layer candidate models of equation (1). Set the new inputs $(x_1, x_2, x_3, x_4, \dots, x_M)$ and $M = M + 1$. The models of the second layer are evaluated for compliance using the MSE, and again the input variables that give the best results proceed to form the third-layer candidate models. This procedure is


carried out as long as the MSE for the test data set decreases compared with the value obtained at the previous layer, as shown in Fig. 2. After the best models of each layer have been selected, the output model is selected by the MSE: the model with the minimum value of the MSE is selected as the output model [15].

Figure 2. Stopping criteria of GMDH algorithm. Axes: Iterations (x) against MSE (y); plot labels: Minimum, Ivakhnenko Polynomial.

3.2 Analysis by Statistical method

Various time series techniques (exponential smoothing, double exponential smoothing, moving average and regression) were used for forecasting demand. Absolute deviations were also calculated. The mean absolute deviations (MADs) found from these calculations are listed in Table 1.

Table 1. MAD of different forecasting methods

Method                                   MAD
3-month Moving Average                   2306
6-month Moving Average                   2791
12-month Moving Average                  2230
Weighted Moving Average                  2056
Regression                               2459
GMDH Method                              704
Exponential α = 0.3                      2286
Exponential α = 0.5                      2053
Double Exponential α = 0.3, β = 0.5      2861

From Table 1 it is seen that the MAD of the GMDH forecast is 704, whereas every statistical method gives a four-digit MAD. The mean absolute percentage error (MAPE) and mean square error (MSE) were also calculated and are reported in Table 2 and Table 3 respectively. The GMDH forecast has a MAPE of only 4%; the nearest value is 9%, obtained by exponential smoothing (α = 0.5). From Table 3 it is clear that the model with the minimum MSE is the GMDH model.

Table 2. MAPE of different forecasting methods

Method                                   MAPE
3-month Moving Average                   11%
6-month Moving Average                   14%
12-month Moving Average                  11%
Weighted Moving Average                  10%
Regression                               12%
GMDH Method                              4%
Exponential α = 0.3                      11%
Exponential α = 0.5                      9%
Double Exponential α = 0.3, β = 0.5      13%

Table 3. MSE of different forecasting methods

Method                                   MSE
3-month Moving Average                   7994519
6-month Moving Average                   10355301
12-month Moving Average                  7710194
Weighted Moving Average                  6291543
Regression                               9177720
GMDH Method                              824882
Exponential α = 0.3                      7619269
Exponential α = 0.5                      6220179
Double Exponential α = 0.3, β = 0.5      11913465

RESULTS AND DISCUSSION

After completing the data analysis we obtained some informative results. The calculated mean absolute deviations (MADs) of the forecasts from the different techniques are plotted in Fig. 3. The GMDH algorithm gives the lowest MAD and is therefore the best fit.

The mean absolute percentage error (MAPE) and mean square error (MSE) are plotted in Fig. 4 and Fig. 5 respectively. The comparison of modelling results shows that the GMDH model performs better than the other models in terms of MAPE and MSE.
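The size of the gap behind this comparison can be checked directly from the values reported in Tables 1 to 3. The Python sketch below copies those reported values and, for each measure, finds the best non-GMDH method and how much lower the GMDH figure is. The names `reported` and `runner_up_gap` are illustrative and not from the paper, and the MAPE entries are the rounded percentages from Table 2, so that margin is approximate.

```python
# Error measures as reported in Tables 1-3: (MAD, MAPE in percent, MSE).
reported = {
    "3-month moving average": (2306, 11, 7994519),
    "6-month moving average": (2791, 14, 10355301),
    "12-month moving average": (2230, 11, 7710194),
    "Weighted moving average": (2056, 10, 6291543),
    "Regression": (2459, 12, 9177720),
    "GMDH": (704, 4, 824882),
    "Exponential smoothing, alpha=0.3": (2286, 11, 7619269),
    "Exponential smoothing, alpha=0.5": (2053, 9, 6220179),
    "Double exponential, alpha=0.3, beta=0.5": (2861, 13, 11913465),
}


def runner_up_gap(metric_index):
    """Return (best non-GMDH method, its value, GMDH reduction in percent)."""
    gmdh = reported["GMDH"][metric_index]
    name, values = min(
        ((m, v) for m, v in reported.items() if m != "GMDH"),
        key=lambda item: item[1][metric_index],
    )
    best = values[metric_index]
    return name, best, 100.0 * (best - gmdh) / best


for label, idx in (("MAD", 0), ("MAPE", 1), ("MSE", 2)):
    name, best, reduction = runner_up_gap(idx)
    print(f"{label}: runner-up is {name} ({best}); GMDH is {reduction:.1f}% lower")
```

On the reported values, the runner-up on all three measures is exponential smoothing with α = 0.5, and the GMDH figure is lower by roughly 66% (MAD), 56% (MAPE) and 87% (MSE).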


Figure 3. Comparison of MAD of different techniques. Axes: Forecasting methods (x) against MAD (y).

Figure 4. Comparison of MAPE of different techniques. Axes: Forecasting methods (x) against MAPE (y).

Figure 5. Comparison of MSE of different techniques. Axes: Forecasting methods (x) against MSE (y).

To assess the performance of GMDH modelling, demand for the last 52 months was forecast and compared with the test set. The results of the model, along with its forecasting precision, are shown in Table 4. The normalized mean absolute error is 4.65%, whereas the normalized RMSE is 6%. The fitting accuracy of the GMDH model is also very good, as the value of R^2 is 0.90.

Table 4. Summary results of GMDH modelling

Metrics                                      Output / Value
Post-processed result                        Model fit
Number of observations                       52
Normalized mean absolute error (NMAE)        4.65%
Normalized root mean square error (NRMSE)    6%
Standard deviation of residuals              5.8%
Coefficient of determination (R^2)           0.90
Correlation coefficient                      0.95

Our findings have several important implications. Useless input variables are eliminated and useful input variables are selected automatically, and the structure parameters and the optimum GMDH architecture can be organized automatically. The case study on the cement time series data demonstrated that the GMDH model is robust in forecasting nonlinear time series.

CONCLUSION

This paper examined the forecasting accuracy of different statistical techniques as well as the GMDH predictor. For that purpose, roughly ten years (110 months) of secondary sales data for a cement product were collected. There was low seasonal variation in the sales. Demand forecasting was performed using extrapolative time series methods, such as exponential smoothing with level, trend, and seasonal components. In addition, moving average, weighted moving average and regression methods were used to forecast demand. A nonlinear self-organizing model based on the Group Method of Data Handling (GMDH) was also applied to derive forecasts. We applied the GMDH predictor, version GMDH Data Science 3.5.9.

In order to evaluate the accuracy of prediction, performance measures such as MAD, MAPE and MSE were calculated. No other method gives results close to those of the GMDH predictor. The GMDH algorithm forecasts with an error of only 0.0367, or about 4%, which is substantially more accurate than the statistical methods.

References

1. Samsudin, R., Saad, P., and Shabri, A., 2010, "Hybridizing GMDH and Least Squares SVM Support Vector Machine for Forecasting Tourism Demand," IJRRAS, Vol. 3(3), pp. 274-279.
2. Ivakhnenko, A. G., and Ivakhnenko, G. A., 1995, "A Review of Problems Solved by Algorithms of the GMDH," Pattern Recognition and Image Analysis, Vol. 5(4), pp. 527-535.
3. Onwubolu, G. C., Buryan, P., and Lemke, F., 2008, "Modeling Tool Wear in End-Milling Using Enhanced GMDH Learning Networks," International Journal of Advanced Manufacturing Technology, Vol. 39(11), pp. 1080-1092.


4. Kondo, T., Pandya, A. S., and Nagashino, A. S., 2007, "GMDH-Type Neural Network Algorithm with a Feedback Loop for Structural Identification of RBF Neural Network," International Journal of Knowledge-Based and Intelligent Engineering Systems, Vol. 11, pp. 157-168.
5. Puig, V., Witczak, M., Nejjari, F., Quevedo, J., and Korbicz, J., 2007, "A GMDH Neural Network-Based Approach to Passive Robust Fault Detection Using a Constraint Satisfaction Backward Test," Engineering Applications of Artificial Intelligence, Vol. 20, pp. 886-897.
6. Li, F., Upadhyaya, B. R., and Coffey, L. A., 2009, "Model-Based Monitoring and Fault Diagnosis of Fossil Power Plant Process Units Using Group Method of Data Handling," ISA Transactions, Vol. 2, pp. 213-219.
7. Kondo, T., 2007, "Nonlinear Pattern Identification by Multi-Layered GMDH-Type Neural Network Self-Selecting Optimum Neural Network Architecture," International Conference on Neural Information Processing, Springer, Berlin, Heidelberg, pp. 882-891.
8. Chang, F. J., and Hwang, Y. Y., 1999, "A Self-Organization Algorithm for Real-Time Flood Forecast," Hydrological Processes, Vol. 13(2), pp. 123-138.
9. Ivakhnenko, A. G., 1970, "Heuristic Self-Organization in Problems of Engineering Cybernetics," Automatica, Vol. 6(3), pp. 207-219.
10. Farlow, S. J., 1981, "The GMDH Algorithm of Ivakhnenko," The American Statistician, Vol. 35(4), pp. 210-215.
11. Muller, J. A., and Lemke, F., 2000, "Self-Organizing Data Mining," Libri Books, Dresden, Berlin, pp. 67-110.
12. Ezennaya, O. S., Isaac, O. E., Okolie, U. O., and Ezeanyim, O. I. C., 2014, "Analysis of Nigeria's National Electricity Demand Forecast (2013-2030)," International Journal of Scientific & Technology Research, Vol. 3, pp. 333-340.
13. Gooijer, J. G. D., and Hyndman, R. J., 2006, "25 Years of Time Series Forecasting," International Journal of Forecasting, Vol. 22, pp. 443-473.
14. Samsudin, R., Saad, P., and Skudai, 2009, "Combination of Forecasting Using Modified GMDH and Genetic Algorithm," International Journal of Computer Information Systems and Industrial Management Applications, Vol. 1, pp. 170-176.
15. Kondo, T., and Ueno, J., 2006, "Revised GMDH-Type Neural Network Algorithm with a Feedback Loop Identifying Sigmoid Function Neural Network," International Journal of Innovative Computing, Information and Control, Vol. 2(5), pp. 985-996.
16. Sahu, P. K., and Kumar, R., 2013, "Demand Forecasting for Sales of Milk Product (Paneer) in Chhattisgarh," International Journal of Inventive Engineering and Sciences, Vol. 1(9), pp. 10-13.
17. Allen, P. G., 1994, "Economic Forecasting in Agriculture," International Journal of Forecasting, Vol. 10, pp. 81-135.
