Applied Soft Computing 49 (2016) 663–675

Contents lists available at ScienceDirect

Applied Soft Computing


journal homepage: www.elsevier.com/locate/asoc

Modelling a combined method based on ANFIS and neural network improved by DE algorithm: A case study for short-term electricity demand forecasting
Yi Yang, Yanhua Chen ∗ , Yachen Wang, Caihong Li, Lian Li
School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, PR China

Article info

Article history:
Received 15 January 2016
Received in revised form 7 June 2016
Accepted 31 July 2016
Available online 3 August 2016

Keywords:
Electricity demand forecasting
Combined forecasting method
BP
ANFIS
diff-SARIMA
DE

Abstract

Electricity demand forecasting, as a vital tool in the electricity market, plays a critical role for power utilities: it can not only reduce production costs but also save energy resources, which makes forecasting techniques an indispensable part of the energy system. A novel combined forecasting method based on the Back Propagation (BP) neural network, the Adaptive Network-based Fuzzy Inference System (ANFIS) and the Difference Seasonal Autoregressive Integrated Moving Average (diff-SARIMA) is presented in this paper. Firstly, the combined method uses each of the three methods (BP, ANFIS, diff-SARIMA) to forecast separately, giving three forecasting results. The final forecasting result is then obtained by multiplying each of the three results by its optimal weight coefficient and adding them up. Among the three individual methods, BP and ANFIS are able to deal with nonlinear data, and diff-SARIMA is able to deal with linear and seasonal data. The combined method therefore eliminates the drawbacks and incorporates the merits of the individual methods, and it is able to deal with linear, nonlinear and seasonal data. In order to optimize the weight coefficients, the Differential Evolution (DE) optimization algorithm is brought into the combined method. To demonstrate its superiority and accuracy, the combined method is verified by comparing it with the three individual methods. The forecasting results of the combined method proved to be better than those of all three individual methods, and the combined method effectively reduced the errors between the actual values and the forecasted values. Using the half-hourly electricity power data of the State of New South Wales in Australia, experimental case studies showed that the proposed combined method performed better than the three individual methods and had a higher accuracy.
© 2016 Published by Elsevier B.V.

Contents

1. Introduction
2. Methodologies
    2.1. BP neural network
    2.2. Adaptive network based fuzzy inference system
    2.3. Seasonal autoregressive integrated moving average
    2.4. Differential evolution
3. Combined forecasting method
    3.1. Theory of the combined forecasting method
    3.2. Determining the weights in the combined method
4. Forecasting statistical metrics
5. Analysis
    5.1. Data pre-analysis
    5.2. The forecasting results of BP
    5.3. The forecasting results of ANFIS
    5.4. The forecasting results of diff-SARIMA
    5.5. The weight coefficients of the combined method
    5.6. Comparisons between the combined method and other individual methods
6. Conclusions
Acknowledgments
References

∗ Corresponding author.
E-mail address: [email protected] (Y. Chen).

http://dx.doi.org/10.1016/j.asoc.2016.07.053
1568-4946/© 2016 Published by Elsevier B.V.

1. Introduction

As people's living standards and socio-economic levels have improved significantly, natural energy consumption continues to grow. Energy shortages are becoming increasingly serious, and more and more countries attach great significance to these issues. As a major impetus for the development of society, electricity, one of the most important energy resources, plays a crucial role in the power system. Electricity is a resource that is difficult to store; in addition, the electric system is affected by various unstable factors, such as weather, population, holidays and emergencies. These problems make it difficult for the electricity production industry to estimate its output. Thus, an accurate and precise forecasting method is needed in the electricity market. Conversely, inappropriate electricity demand forecasting is counterproductive. A method that overestimates demand increases the workload of electricity production and wastes energy resources. Bunn and Farmer pointed out that a 1% increase in forecasting error raises operating costs in electricity production by 10 million dollars [1]. Meanwhile, a method that underestimates demand can paralyze the electricity grid, leaving some regions facing power failures. It is quite clear that a good forecasting method, which can avoid many such disasters, is the key to responding to future electricity demand. Therefore, precise electricity demand forecasting is the prerequisite for meeting demand, in developed and developing countries alike. Whatever the kind of electricity demand forecast (long-term, midterm or other), developing a novel method that is not only effective but also improves forecasting accuracy is a must [2].

More and more novel methods have been proposed in recent years because of the need for better forecasting performance, for example regression based methods and time series based methods. Among the regression methods, linear regression methods are the most widely used. Goia et al. [3] applied a linear regression method to forecast the peak load. Bianco et al. [4,5] used linear regression models to forecast electricity consumption in Italy. ARIMA is the most commonly used method among time series models. For instance, Dong et al. [6] employed the time series approach to forecast long-term load, Erdogdu et al. [7] analyzed electricity demand in Turkey by using ARIMA, and Wang et al. [8] proposed seasonal ARIMA to forecast electricity demand in China. Although these methods (regression based methods, time series based methods) are now mature, the electricity demand time series is clearly nonlinear, so they still cannot properly forecast nonlinear load series. Moreover, these methods are affected to a great extent by the reliability and availability of external factors.

Over the last several decades, Artificial Intelligence (AI) methods have demonstrated a formidable ability to deal with seasonal and nonlinear load data. More and more new methodologies and techniques have emerged, such as the fuzzy logic system [9,10], the expert system [11,12], the grey prediction models [13,14], the Artificial Neural Networks (ANN) [15–17] and the fuzzy inference system [18–20]. Among all these methods, artificial neural networks and fuzzy theories are well received because they can process nonlinear data. Xiao et al. [21] developed a BP neural network with a rough set to forecast short-term time series data, and the results proved the superiority of BP. Azwadi et al. [22] used ANFIS to predict the temperature and flow fields in a lid-driven cavity.

Despite the introduction of artificial intelligence, no individual method can deliver the desired outcomes on its own, because each has its disadvantages. For instance, neural networks tend to reach local optima rather than the global optimum. Expert systems rely excessively on knowledge and cannot always obtain optimal results, while grey prediction systems are suitable for exponential growth models. Thus, to take full advantage of every method's merits, the concepts of hybrid and combined methods developed rapidly. The idea of the combined method was first introduced by Bates et al. [23], who proved that combined methods were more efficient and easier than single ones. In addition, hybrid and combined methods aggregate the advantages of multiple separate methods. Because of these advantages, hybrid and combined methods are widely used in different applications. Xiong et al. [24] proposed a hybrid method based on support vector regression with a firefly algorithm to forecast a stock price index. Yu et al. [25] used a novel hybrid structure, the MPSO-BP adaptive algorithm, with the Radial Basis Function Network (RBFN). Tan et al. [26] developed a combined method using three individual methods, namely wavelet transform, ARIMA and Generalized Autoregressive Conditional Heteroscedasticity (GARCH), to forecast the electricity price.

Although hybrid and combined methods both have advantages, they focus on different aspects. Hybrid methods use a series of methods to process the data in advance, such as noise reduction, seasonal adjustment and clustering, while combined methods assign weight coefficients to the individual methods that constitute them. With reference to the merits of these two kinds of methods, this paper proposes a novel combined method whose advantage is that both linear and nonlinear time series data can be processed. The combined forecasting method works as follows: the diff-SARIMA method deals with the linear data, while the BP method and the ANFIS method deal with the nonlinear data. During parameter optimization, the DE optimization algorithm was utilized to optimize the weight coefficients of the combined method. These three methods have been used separately in different applications.

The BP neural network is a multilayer feedforward network based on input-output mapping and trained by the error back propagation algorithm. By using steepest gradient descent or other methodologies to adjust all the weights and thresholds of the network, the least sum of squared errors can be obtained after several iterations. BP, as one of the most traditional models, has been adopted in many applications. Reliability forecasting models for electrical distribution systems considering component failures and planned outages were built on BP

by Xie et al. [27]. Khashman [28] proposed a modified BP learning algorithm with added emotional coefficients.

The Autoregressive Integrated Moving Average (ARIMA) forecasting method is regarded as a traditional time series forecasting method [29,30]. Its basic idea is that the forecasted object, which is observed as time goes by, is regarded as a random series. Once a suitable approximate mathematical model describing the sequence has been identified, the forecasting data can be obtained from it. The ARIMA method has been widely employed in different areas, such as production value prediction [31], fuel wood price forecasting [32] and wind speed forecasting [33]. Besides using SARIMA, differencing is also introduced in this paper. Specifically, a difference operation was applied to the original data before putting it into the SARIMA method. Because the original data show a significant linear trend, diff-SARIMA can level the data off. In the experiments, the result outperformed the single SARIMA.

ANFIS, which brought learning principles into fuzzy systems, was first presented by Roger Jang, creating a self-adaptive system that can imitate human recognition. A neural network can train and generate data autonomously once embedded into a fuzzy structure. After adjustment, the best membership functions and fuzzy rules can be obtained. ANFIS is commonly used in applications involving nonlinear data. Petković et al. [34] employed ANFIS to estimate silicone rubber mechanical properties. In order to forecast rainfall, El-Shafie et al. [35] proposed a model based on ANFIS. ANFIS was also used in the prediction of the solubility of carbon dioxide by Khajeh et al. [36].

Even after combining these three methods into a novel method, a problem still exists: the proportion coefficients in the combined method must be adjusted so that the results are the best possible. To address this problem and improve accuracy, parameter optimization algorithms have been applied in many applications. Ke et al. [37] applied BP optimized by GA (Genetic Algorithms) to forecast the industry loan of electricity power. A grey model whose parameters were optimized by PSO (Particle Swarm Optimization) was proposed by Wang et al. [38]. Chen et al. [39] used an efficient Ant Colony Optimization (ACO) to optimize image features.

In this paper, DE was chosen as the evolutionary algorithm. Like other evolutionary algorithms such as GA, PSO and ACO, DE is a stochastic parallel algorithm based on swarm intelligence. DE has the special ability to keep a memory of knowledge in order to track the current search dynamically, so the optimal outcomes can be obtained by adjusting the parameters. Compared with GA, PSO and ACO, DE has a strong capacity for global convergence and robustness, which helps it reach the best results. GA and PSO, however, possess too many parameters that greatly influence the results. In real applications this increases the difficulty of the algorithm, especially for high-dimensional optimization problems. Thus, DE is the better algorithm to adjust the parameters.

By combining BP, ANFIS and diff-SARIMA and adjusting the weight coefficients by DE, the combined method is expected to recognize most of the seasonal, linear and nonlinear patterns.

The detailed procedures for the proposed method are listed below:

(1) Input the original data into the BP neural network and, after several rounds of training and testing, get the forecasted data.
(2) Build a new time series by subtracting each observation from the one that follows it, and then put the new time series into the SARIMA method to get the final forecasted data.
(3) Put the original series into ANFIS, and obtain the results.
(4) Apply DE to adjust the weight coefficients between the three methods. If the results are within the accuracy limit, the iteration stops; otherwise, the weights continue to be adjusted by DE.
(5) Compare the three single methods mentioned above with the proposed combined method.

The rest of the paper is organized as follows: the three individual methods (BP, ANFIS and diff-SARIMA) and the DE optimization algorithm are described in Section 2. In Section 3, the combined method is introduced. In Section 4, statistical measures of forecasting performance are presented. Section 5 presents the comparison between the combined method and the separate methods. Finally, Section 6 summarizes the whole paper.

2. Methodologies

Many methods exist for forecasting electricity production and consumption, such as time series methods, classical statistical methods and artificial intelligence methods. This paper takes advantage of the merits of different methods and proposes a novel combined method consisting of three methods: the BP neural network, ANFIS, and diff-SARIMA.

2.1. BP neural network

The artificial neural network possesses the characteristics of self-organization, adaptivity and self-learning, as well as nonlinearity, non-locality, non-steadiness, non-convexity and other characteristics, which make it a powerful tool for solving complex problems [40]. As a kind of ANN, BP is extensively used in many cases.

Fig. 1. The structure of BP (input layer x1, ..., xn; hidden layer; output layer z1, ..., zt; connection weights wji and wil).

This research focused on a three-layer BP neural network, whose framework is presented in Fig. 1. Each neuron in the network is a node, and the network is composed of an input layer, a hidden layer and an output layer. Adjacent layers are connected through weight coefficients. It is noteworthy that the hidden part can consist of a single layer or of multiple layers. The BP process is made up of two stages: forward propagation and error back propagation. When the BP neural network starts learning, the input signal enters the input layer and then reaches the hidden layer. After a series of processing steps, it finally arrives at the output layer. During forward propagation, if the expected output is obtained, the learning process terminates. Otherwise, the process turns to back propagation. Back-propagation is a procedure in which the error signal (the difference between the expected output and the network output) is propagated backward along the original connection path [41]. In order to decrease the error signal, the weights of each neuron were adjusted by means of the gradient descent method. The

training process is described by the following two steps to update these weighted values [42].

(1) Hidden layer stage: The outputs of all neurons in the hidden layer are calculated by the following equations:

$$\mathrm{net}_i = \sum_{j=0}^{m} w_{ji} x_j, \quad i = 1, 2, \ldots, n, \tag{1}$$

$$y_i = f_H(\mathrm{net}_i), \quad i = 1, 2, \ldots, n. \tag{2}$$

where $\mathrm{net}_i$ is the activation value of the ith node, $y_i$ is the output of the hidden layer, and $f_H$ is called the activation function of a node, usually a sigmoid function as follows:

$$f_H(x) = \frac{1}{1 + \exp(-x)} \tag{3}$$

(2) Output stage: The outputs of all neurons in the output layer are given as follows:

$$z_l = f_l\left(\sum_{i=0}^{n} w_{il} y_i\right), \quad l = 1, 2, \ldots, t \tag{4}$$

where $f_l$ ($l = 1, 2, \ldots, t$) is the activation function, which is usually defined as a linear function. All weights are initially assigned random values, and are traditionally modified by the delta rule according to the learning samples.

2.2. Adaptive network based fuzzy inference system

The Adaptive Network Based Fuzzy Inference System (ANFIS), which integrates the merits of fuzzy systems and neural networks, was proposed by Jang [43]. Using the neural network learning mechanism, it automatically extracts rules from input and output sample data, and thus constitutes a self-adaptive neural fuzzy controller. Through off-line training and an online learning algorithm, it creates fuzzy reasoning and controls the self-tuning of the rules, which makes the system itself move in a self-adaptive, self-organizing and self-learning direction [44]. The ANFIS architecture is shown in Fig. 2. From Fig. 2, it can be seen that the ANFIS network comprises five layers. Each layer contains several nodes described by the node function.

Fig. 2. The structure of ANFIS.

Layer 1
The first layer of this architecture is the fuzzy layer. Each node of this layer produces the membership grade of a fuzzy set. Every node in this layer is an adaptive node with a node function.

$$O_{1,i} = \mu_{A_i}(x) \tag{5}$$

where $x$ is the input to node $i$, and $A_i$ is the linguistic label associated with this node function. Premise parameters change the shape of the membership function.

Layer 2
Every node in this layer is a circular node labeled $\Pi$, i.e. it performs a T-norm operation:

$$O_{2,i} = w_i = \mu_{A_i}(x) \times \mu_{B_i}(y), \quad i = 1, 2 \tag{6}$$

Layer 3
Each node, labeled N, computes the ratio of the ith rule's firing weight to the sum of all rules' firing weights:

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{7}$$

The outputs of this layer are called normalized firing strengths.

Layer 4
Compute the contribution of the ith rule to the overall output.

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i (a_i x + b_i y + c_i), \quad i = 1, 2 \tag{8}$$

where $\bar{w}_i$ is the output of layer 3, and $\{a_i, b_i, c_i\}$ is the parameter set.

Layer 5
The single node computes the final output as the summation of all incoming signals:

$$O_{5} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{9}$$

The final output of the adaptive neuro-fuzzy inference system is expressed as:

$$f_{out} = \bar{w}_1 f_1 + \bar{w}_2 f_2 = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2 = (\bar{w}_1 x) a_1 + (\bar{w}_1 y) b_1 + (\bar{w}_1) c_1 + (\bar{w}_2 x) a_2 + (\bar{w}_2 y) b_2 + (\bar{w}_2) c_2 \tag{10}$$

2.3. Seasonal autoregressive integrated moving average

The SARIMA method, which is one of the most popular time series forecasting methods, was introduced by Box and Jenkins [45]. Generally speaking, it is assumed that the time series $\{Z_t \mid t = 1, 2, \ldots, k\}$ has mean zero. A non-seasonal ARIMA model of order $(p, d, q)$ (denoted by ARIMA$(p, d, q)$) representing the time series can be expressed as:

$$Z_t = \phi_1 Z_{t-1} + \phi_2 Z_{t-2} + \ldots + \phi_p Z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \ldots - \theta_q a_{t-q} \tag{11}$$

or

$$\phi(B) \nabla^d Z_t = \theta(B) a_t \tag{12}$$

where $Z_t$ and $a_t$ are the actual value and random error at time $t$, respectively; $\phi_i$ and $\theta_j$ are the coefficients, $p$ is the order of the autoregressive polynomial, $q$ is the order of the moving average polynomial, $B$ denotes the backward shift operator, $\nabla^d = (1 - B)^d$, $d$ is the order of regular differences, and $\phi(B)$ and $\theta(B)$ are defined as $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \ldots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \ldots - \theta_q B^q$ respectively. In particular, it is assumed that the $a_t$ are independent and identically distributed normal random variables with mean zero and variance $\sigma^2$, and that the roots of $\phi(Z) = 0$ and $\theta(Z) = 0$ all lie outside the unit circle [46]. In the same way, a seasonal model (using the second expression) can be written as:

$$\phi(B) \Phi(B^S) \nabla^d (1 - B^S)^D Z_t = \theta(B) \Theta(B^S) a_t \tag{13}$$

where ˚(BS ) = 1 − ˚1 BS − ˚2 B2S − · · · − ˚p BpS , (BS ) = 1 − 1 BS − 3. Combined forecasting method


2 B2S − · · · − q BqS , D is the number of seasonal differences, S is the
period. If the time series is with mean  = / 0, Zt should be replaced 3.1. Theory of the combined forecasting method
with Zt − .
A widely discussed topic is how to combine the existing tech-
niques so as to achieve perfect forecasting results. Armstrong’s
2.4. Differential evolution meta-analysis indicates that combining various techniques is more
useful for a short-term problem forecasting [52]. The combined
The Differential Evolution (DE) algorithm is a kind of evolution- forecasting theory states that if there exist n kinds of forecasting
ary algorithm, which was first proposed by Storn and Price in 1995 techniques for solving a certain forecasting problem, with prop-
[47]. DE and other evolutionary algorithms common ground is that erly allocated weight coefficients, several techniques’ forecasting
they are all a population-based algorithms covering the procedure results can be added up. Assume that yt (t = 1, 2, ..., m) is the
below: crossover, mutation and selection. Among the basic pro- actual time series data, fit (i = 1, 2, ..., n) is the forecasting value
cedure of DE, the mutation process and the selection step differs at time t, and wi is the weight coefficient for the i-th forecasting
from other evolutionary algorithms [48,49]. Compared with other method. Then the forecasting value of the combined method can
evolutionary algorithms, DE has the following superiority: simple be expressed as [53,54]:
structure, fast convergence, ease of use, speed and robustness [50].
n
Consequently, the DE algorithm is regarded as a potential algorithm ŷt =  ω̂i fit , t = 1, 2, . . ., m (18)
for solving the optimization problems [51]. The specific procedures i=1
are shown in the following: 
n
Step 1. Initialization ωi = 1 (19)
The DE population consists of N vectors, and D is the length of i=1
the chromosome. Initially, generate an N ∗ D matrix with uniform
probability distribution random values. For the ith vector Xi , it is The forecasting error of the combined method is written as follows:
generated as follows: 
n 
n 
n 
n

et = yt − ŷt = ωi yt − ωi fit = ωi (yt − fit ) = ωi eit (20)


Xi,j = low[j] + rand · (high[j] − low[j]) (14) i=1 i=1 i=1 i=1

The combined method used in this paper combining the technique


in which, i = 1, 2, ..., N, j = 1, 2, ..., D, rand is a random number
BP neural network, ANFIS and diff-SARIMA. Then the forecasting
with a uniform probability distribution, and high[j], low[j] is the
value in Eq. (18) became
upper bound and lower bound of the jth column, respectively.
Step 2. Mutation ŶCombined(t) = ω1 ŶBP(t) + ω2 ŶANFIS(t) + ω3 ŶSARIMA , t = 1, 2, ..., m
After initialization, the mutation operation is used to gener-
ate the mutant vector Vi for each target vector Xi in the current (21)
population. where ŶCombined(t) , ŶBP(t) , ŶANFIS(t) and ŶCombined(t) are the forecasting
results at period t by the combined method, BP neural network,
Vi = Xr1 + F · (Xr2 − Xr3 ) (15) ANFIS and SARIMA respectively, and ωi (i = 1, 2, 3) is the weight
coefficient allocated to BP, ANFIS and diff-SARIMA technologies respectively, with $\sum_{i=1}^{3} \omega_i = 1$, $0 \le \omega_i \le 1$.

in which $F$ is the mutation factor and $r_1, r_2, r_3 \in \{1, 2, \ldots, N\}$ are randomly chosen indices that must differ from each other and from $i$, that is, $r_1 \ne r_2 \ne r_3 \ne i$.

Step 3. Crossover
After the mutation, DE applies the crossover procedure, which increases the diversity of the current population. Its main purpose is to produce the trial vector $U_i$ from $X_i$ and $V_i$. The most commonly used operator is the binomial (uniform) crossover, performed on each component as follows:

$$U_{i,j} = \begin{cases} V_{i,j}, & \text{if } rand(j) \le C \text{ or } j = j_{rand} \\ X_{i,j}, & \text{otherwise} \end{cases} \tag{16}$$

in which $C$ is the crossover rate and $j_{rand}$ is a randomly chosen index within $[1, D]$.

Step 4. Selection
Finally, to keep the population size constant in the following generations, the selection operation determines whether the trial or the target vector survives to the next generation. In DE, the one-to-one tournament selection is used as follows:

$$X_i = \begin{cases} U_i, & \text{if } f(U_i) \le f(X_i) \\ X_i, & \text{otherwise} \end{cases} \tag{17}$$

where $f(x)$ is the objective function to be optimized.

3.2. Determining the weights in the combined method

Determining the weight coefficient of each technology is a key issue for the combined method. A combined method with appropriate weight coefficients can achieve good forecasting results. On the contrary, a combined method with inappropriate weight coefficients can forecast worse than a single method. The simplest and most obvious choice is to distribute the weight coefficients equally, that is, to set $\omega_1 = \omega_2 = \omega_3 = 1/3$ in Eq. (19). However, in most cases equal weight coefficients cannot generate the desired forecasting results, and therefore in this paper the DE technology was applied to determine the three weight coefficients in the combined method. Fig. 3 shows the flow chart of the proposed combined method.

4. Forecasting statistical metrics

There are three commonly used metrics which assess the performance of forecasting methods. Each of them evaluates the method from a different aspect. They are MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error) and RMSE (Root Mean Square Error). MAE gives the average of the absolute forecasting errors, while RMSE first averages the squared forecasting errors and then takes the square root of the result.
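As an illustration of how the DE steps (mutation, crossover, selection) could be used to determine the three weights of Section 3.2, the following minimal sketch searches for weights that minimise the MAPE of the weighted combination. Several points are assumptions rather than the authors' settings: the classic DE/rand/1 mutation $V_i = X_{r_1} + F (X_{r_2} - X_{r_3})$, MAPE as the objective, the clip-and-rescale repair that keeps the weights in $[0, 1]$ with sum 1, and all names and parameter values (`de_weights`, `n_pop`, `F`, `C`, `n_gen`).

```python
import random

def combine(weights, forecasts):
    """Weighted sum of the individual forecasts at one time step."""
    return sum(w * f for w, f in zip(weights, forecasts))

def mape(weights, actual, forecasts):
    """MAPE (%) of the combined forecast; forecasts[t] holds one value per method."""
    errs = [abs((a - combine(weights, f)) / a) for a, f in zip(actual, forecasts)]
    return 100.0 * sum(errs) / len(errs)

def normalise(v):
    """Assumed repair rule: clip to [0, 1], then rescale so the weights sum to 1."""
    v = [min(max(x, 0.0), 1.0) for x in v]
    s = sum(v)
    return [x / s for x in v] if s > 0 else [1.0 / len(v)] * len(v)

def de_weights(actual, forecasts, n_pop=20, F=0.5, C=0.9, n_gen=200, seed=0):
    """Search the weight vector with DE; returns (best_weights, best_mape)."""
    rng = random.Random(seed)
    D = len(forecasts[0])  # number of individual methods
    pop = [normalise([rng.random() for _ in range(D)]) for _ in range(n_pop)]
    cost = [mape(x, actual, forecasts) for x in pop]
    for _ in range(n_gen):
        for i in range(n_pop):
            # Mutation: three distinct indices, all different from i (assumed DE/rand/1)
            r1, r2, r3 = rng.sample([k for k in range(n_pop) if k != i], 3)
            v = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(D)]
            # Binomial crossover, Eq. (16)
            j_rand = rng.randrange(D)
            u = [v[j] if (rng.random() <= C or j == j_rand) else pop[i][j]
                 for j in range(D)]
            u = normalise(u)
            # One-to-one selection, Eq. (17)
            cu = mape(u, actual, forecasts)
            if cu <= cost[i]:
                pop[i], cost[i] = u, cu
    best = min(range(n_pop), key=cost.__getitem__)
    return pop[best], cost[best]

# Hypothetical toy data: the second method is exact, the other two are biased
actual = [100.0, 110.0, 120.0, 130.0]
forecasts = [(a * 1.1, a * 1.0, a * 1.2) for a in actual]
weights, err = de_weights(actual, forecasts)
print(weights, err)
```

On this toy data the search should move most of the weight to the second method, since any weight on the biased methods raises the error. With real forecasts, `forecasts[t]` would hold the BP, ANFIS and diff-SARIMA values for time step `t`.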
Fig. 3. Flowchart of the combined method.

Fig. 4. The process of training dataset and testing dataset selection.

MAPE is the criterion that assesses the relative accuracy of the result. These metrics can be calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2} \tag{22}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|x_i - \hat{x}_i\right| \tag{23}$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{x_i - \hat{x}_i}{x_i}\right| \times 100\% \tag{24}$$

where $x_i$ represents the actual value and $\hat{x}_i$ represents the forecasted value. The smaller these metrics are, the more accurate the method is.

5. Analysis

To achieve accurate results, large amounts of different data were used to train and test each method. In order to attest to the effectiveness and accuracy of the combined method, several experiments were carried out and the optimal weight values were chosen. Comparing the combined method with the other three methods, namely BP, ANFIS and diff-SARIMA, the results confirmed the advantages of the combined method.

5.1. Data pre-analysis

As is well known, people's work, life, commercial and industrial activities are similar on the same day of the week in different weeks, so electricity consumption shows the same similarity. Because of these similar properties, the electricity demand data can be divided into groups, which makes the results more reliable.

As a case study, the electricity demand data from 2nd May 2011 to 3rd July 2011 in New South Wales were selected. These data were obtained every half hour from 00:00 to 23:30, which means each day had 48 observations and the whole week to be forecasted had 336 observations.

In order to forecast easily and take advantage of the similar properties, the data pre-analysis divides the initial data by day of the week into seven groups. All the Mondays were put into the first subset, all the Tuesdays into the second subset, and so on. Training dataset and testing dataset selection are shown in Fig. 4. From Fig. 4, it can be seen that items 1 to k − 1 of the first subset are used for model fitting and training, and items 2 to k of the same subset are used for forecasting with the constructed model. After several rounds of training, the model became stable enough to adjust to different factors and cases. For example, the electrical power data from the first Monday, 2nd May 2011, to the last Monday before 3rd July 2011 were put in the first subset. Using all the preceding Mondays, the next Monday could be forecasted. After several experiments and simulations, the final results were obtained. In this paper, the whole 8 weeks of electricity demand data were used to forecast the same day of the following week.

The following sections mainly analyse the four methods. By simulating all the methods with the same dataset, the results can be compared. Sections 5.2, 5.3 and 5.4 display the forecasting results of BP, ANFIS and diff-SARIMA respectively. Section 5.5 states the weight coefficients of the three individual methods. Finally, Section 5.6 displays the comparisons between the combined method and the three individual methods.

5.2. The forecasting results of BP

The number of hidden layer neurons directly affects the forecasting results of BP, and a good choice achieves better forecasting results. After several experiments and tests, the number of hidden layer neurons was set to 5 and the number of output layer neurons was set to 1. The other parameter values were set as follows: the number of iterations was 1000, the learning rate was 0.1 and the target precision was 0.00004. The forecasting results showed that this parameter setting performed better than the other parameter values tested.

Three typical criteria results of BP (MAE, MAPE and RMSE) are presented in Fig. 5 Part A. These bar graphs show the values of the results intuitively. It can be seen that BP did well on Tuesday, Wednesday, Saturday and Sunday. Good results on four days partly indicate that BP was good at forecasting this dataset. Part B displays the forecasting values on Wednesday, a day that was chosen randomly. The curves show directly the difference between the actual data and the forecasted data, and the two curves almost overlap. Table 1 gives the specific values of the three typical criteria; from Table 1, the highest, lowest and average values of MAE, MAPE and RMSE can easily be observed.
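The day-of-week grouping described in Section 5.1 can be sketched as follows. This is only an illustration under stated assumptions: the function name `group_by_weekday` is hypothetical, the series is assumed to start at 00:00 on 2nd May 2011, and the demand values are placeholders rather than the New South Wales data.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def group_by_weekday(start, values, per_day=48):
    """Split a half-hourly series into seven weekday subsets.

    Returns {weekday: [day_1, day_2, ...]} where weekday 0 is Monday and
    each day is a list of per_day observations.
    """
    groups = defaultdict(list)
    n_days = len(values) // per_day
    for d in range(n_days):
        day = start + timedelta(days=d)
        groups[day.weekday()].append(values[d * per_day:(d + 1) * per_day])
    return groups

start = datetime(2011, 5, 2)                  # 2nd May 2011, a Monday
n_days = 63                                   # 2nd May to 3rd July 2011 inclusive
values = [float(i) for i in range(n_days * 48)]  # placeholder demand values

groups = group_by_weekday(start, values)
mondays = groups[0]
# Eight earlier Mondays are available for training; the last one is the target
train, target = mondays[:-1], mondays[-1]
print(len(train), len(target))
```

The period covers nine weeks, so each weekday subset holds nine days: eight for training and one to be forecasted, which matches the 8 weeks of data used to forecast the following week.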

Fig. 5. The forecasting results of BP.

Table 1
Three statistical metrics of BP in a week.

           Mon       Tue       Wed       Thu       Fri       Sat       Sun       Week
MAE        947.827   294.073   307.035   802.427   1055.58   139.650   504.072   528.356
MAPE (%)   9.601     3.020     2.947     8.898     9.711     1.597     1.597     5.315
RMSE       1012.60   334.641   340.268   856.592   1095.34   164.221   542.335   565.515
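The three criteria reported in the tables follow Eqs. (22)–(24). A minimal sketch of how they could be computed from paired actual and forecasted values is given below; the function names are illustrative, and the sample values are the first four Wednesday rows of Table 6 (actual and combined forecast), used here only as an example input.

```python
import math

def mae(actual, forecast):
    """Mean absolute error, Eq. (23)."""
    return sum(abs(x - f) for x, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error in percent, Eq. (24)."""
    return 100.0 * sum(abs((x - f) / x) for x, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean square error, Eq. (22)."""
    return math.sqrt(sum((x - f) ** 2 for x, f in zip(actual, forecast)) / len(actual))

# First four half-hourly Wednesday values from Table 6 (MW)
actual = [9075.54, 8852.95, 8711.13, 8508.96]
combined = [9078.028, 8857.232, 8657.571, 8352.39]
print(mae(actual, combined), mape(actual, combined), rmse(actual, combined))
```

Because MAPE divides by the actual value, it is only defined when no actual observation is zero, which holds for electricity demand data of this kind.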

Table 2
Three statistical metrics of ANFIS in a week.

           Mon       Tue       Wed       Thu       Fri       Sat       Sun       Week
MAE        315.377   195.045   299.003   230.136   196.388   340.176   335.067   273.027
MAPE (%)   3.122     2.061     2.942     2.358     1.999     3.790     3.857     2.876
RMSE       411.768   217.696   352.229   296.770   264.709   372.587   363.579   325.620

5.3. The forecasting results of ANFIS

Fig. 6 Part A presents the three typical results of ANFIS. From the bar graphs it can be seen that ANFIS had comparatively high values of MAPE on Monday, Wednesday, Saturday and Sunday, with the highest value almost 4%, whereas the remaining days had comparatively low values, all under 2.4%. Overall, Fig. 6 Part A shows that the results of ANFIS forecasting are average, neither too high nor too low. In order to display the forecasting details of ANFIS, the electricity demand values of the 48 observations on Wednesday are presented in Fig. 6 Part B. Between 0:00 and 4:00 and between 16:00 and 18:00, the results attained from ANFIS were close to the actual values, while at other times the errors were larger. Table 2 shows the values of the metrics for the ANFIS method. All the MAPE results were between 1% and 4%, and the best result was 1.999%.

5.4. The forecasting results of diff-SARIMA

From Fig. 7 Part A, the value of MAPE on Tuesday was the lowest, and the other days were almost the same. The curves in Fig. 7 Part B look smooth and there is a slight difference between the actual data and the forecasted data. It can be seen that the forecasting values were always higher than the actual values on Wednesday. The three typical results of diff-SARIMA are displayed in Table 3; the MAPE was 1.545% on Tuesday and over 3% on the other days.

Fig. 6. The forecasting results of ANFIS.

Table 3
Three statistical metrics of diff-SARIMA in a week.

           Mon       Tue       Wed       Thu       Fri       Sat       Sun       Week
MAE        441.194   147.019   411.315   346.598   361.705   486.873   530.324   389.290
MAPE (%)   4.741     1.545     4.051     3.418     3.543     5.208     5.922     4.061
RMSE       540.778   169.799   422.462   384.332   414.669   506.207   549.440   426.812

5.5. The weight coefficients of the combined method

The proposed method is a novel method which combines three different methods. The weight coefficients of the three methods for each day of the week are presented in Table 4. From Table 4, it can be seen that each day had different weight coefficients. Different weight coefficients mean different degrees of influence: the higher the weight coefficient, the greater the influence of that method. For example, on Saturday the weight coefficient of ANFIS was larger than those of the other two methods, which means that ANFIS dominated the combined method on that day. As seen from Table 4, the combined method took advantage of the three individual methods, and each method had a different effect on the combined method on each day. This flexibility underlies the advantage of the combined method over the three individual methods.

Table 4
The weight coefficients of the combined method.

            BP         ANFIS      diff-SARIMA
Monday      0.393123   0.33988    0.343686
Tuesday     0.378311   0.239307   0.384519
Wednesday   0.236525   0.347067   0.394573
Thursday    0.073223   0.0346     0.871758
Friday      0.326001   0.500754   0.13219
Saturday    0.009652   0.835712   0.150481
Sunday      0.456883   0.423969   0.073735

5.6. Comparisons between the combined method and other individual methods

In this section, to verify the forecasting ability of the combined method, it was compared with the three individual methods (BP, ANFIS and diff-SARIMA). Fig. 8 shows the predicted values of all four methods for each day, with each color standing for a different method. As seen from Fig. 8, the curve of the combined method was more consistent with the actual data, which means the combined method outperformed the individual methods. Among the three individual methods, it can be roughly seen that the BP curve was less consistent with the actual data on Monday, Thursday and Friday, while the ANFIS and diff-SARIMA curves were more consistent with the actual data on Tuesday, Wednesday and Saturday. However, the diff-SARIMA curve was less consistent with the actual data on Monday, Thursday and Friday, where the difference between the actual data and the forecasted data was bigger.

Table 5 shows the calculated values of the three statistical metrics and the average values of the four methods. More detailed information is revealed in Table 6 that consists of values of the four

Fig. 7. The forecasting results of diff-SARIMA.

Fig. 8. Final forecasted values for each day by four methods.



Table 5
Three statistical criteria of four methods.

RMSE
Date         BP         diff-SARIMA   ANFIS     Combined
Monday       1012.603   540.778       411.768   431.770
Tuesday      334.641    169.799       217.696   99.038
Wednesday    340.268    422.462       352.229   157.782
Thursday     856.592    384.332       296.770   162.265
Friday       1095.342   414.669       264.709   263.800
Saturday     164.221    506.207       372.587   170.185
Sunday       542.335    549.440       363.579   110.335
Whole week   565.515    426.812       325.620   199.311

MAE
Date         BP         diff-SARIMA   ANFIS     Combined
Monday       947.827    441.194       315.377   334.757
Tuesday      294.073    147.019       195.045   82.372
Wednesday    307.035    411.315       299.003   136.309
Thursday     802.427    346.598       230.136   122.167
Friday       1055.578   361.705       196.388   219.549
Saturday     139.650    486.873       340.176   139.984
Sunday       504.072    530.324       335.067   81.593
Whole week   528.356    389.290       273.027   159.533

MAPE (%)
Date         BP         diff-SARIMA   ANFIS     Combined
Monday       9.601      4.741         3.122     3.275
Tuesday      3.020      1.545         2.061     0.827
Wednesday    2.947      4.051         2.942     1.400
Thursday     8.898      3.418         2.358     1.242
Friday       9.711      3.543         1.999     2.226
Saturday     1.597      5.208         3.790     1.639
Sunday       5.601      5.922         3.857     0.969
Whole week   5.315      4.061         2.876     1.654

Table 6
Actual and forecasted values (MW) of the four methods on Wednesday.

Time (h) Actual BP Diff-SARIMA ANFIS Combined Time (h) Actual BP Diff-SARIMA ANFIS Combined

0:00 9075.54 9075.229 9396.926 9388.197 9078.028 12:00 9835.35 10237.26 10259.48 10071 9983.172
0:30 8852.95 8860.05 9184.872 9124.196 8857.232 12:30 9810.23 10211.02 10248.29 9983.349 9948.918
1:00 8711.13 8657.702 8954.259 8961.677 8657.571 13:00 9793.52 10151.83 10234.31 10011.44 9929.5
1:30 8508.96 8356.377 8676.238 8577.358 8352.39 13:30 9756.38 10137.14 10229.14 9969.884 9912.536
2:00 8144.21 7971.671 8306.902 8100.33 7960.313 14:00 9743.84 10139.62 10239.42 9925.841 9907.031
2:30 7773.2 7600.422 7962.122 7779.599 7619.563 14:30 9766.08 10099.98 10176.89 9945.684 9873.298
3:00 7498.6 7336.428 7738.948 7520.544 7378.608 15:00 9808.06 10128.32 10224.44 9964.103 9906.252
3:30 7306.29 7202.014 7575.665 7421.331 7244.063 15:30 9960.08 10297.97 10389.67 10073.45 10056.19
4:00 7228.83 7145.443 7542.275 7418.594 7210.608 16:00 10236.11 10582.18 10616 10328.09 10304.36
4:30 7288.78 7274.426 7630.468 7475.585 7303.651 16:30 10685.9 11068.91 11017.19 10802.27 10743.75
5:00 7555.85 7545.053 7956.35 7721.507 7584.328 17:00 11297.21 11791.95 11724.87 11311.4 11394.34
5:30 8025.08 8081.051 8393.846 8307.872 8081.669 17:30 11713.4 12218.42 12237.19 11335.49 11750.2
6:00 8844.56 9024.524 9264.007 9481.52 9030.057 18:00 11705.29 12168.57 12155.95 11333.26 11700.32
6:30 9643.28 9912.546 10084.56 10190.35 9829.684 18:30 11569.77 12011.77 12008.51 11336.38 11588.46
7:00 10058.81 10427.97 10515.63 10637.54 10284.43 19:00 11285.7 11723.37 11736.86 11338.23 11381.62
7:30 10470.08 10863.09 10900.37 11049.03 10684.58 19:30 11051.7 11509.57 11559.16 11324.44 11234.04
8:00 10554.39 10928.77 10953.98 11070.16 10733.53 20:00 10853.03 11285.09 11373.49 11297.21 11076.43
8:30 10435.16 10781.53 10838.19 10877.77 10591.23 20:30 10585.16 10996.06 11141.26 11207.94 10863.37
9:00 10406.96 10792.66 10831.42 10781.92 10569.75 21:00 10252.17 10700.97 10819.81 10950.77 10573.29
9:30 10344.68 10771.34 10787.81 10655.88 10515.33 21:30 9887.65 10329.16 10450.92 10518.93 10196.55
10:00 10279.92 10690.77 10715.47 10554.61 10434.87 22:00 9862.22 10269.69 10378.09 10396.27 10118.16
10:30 10151.66 10590.24 10627.21 10405.53 10329.9 22:30 9544.33 9867.452 10066.39 10045.9 9772.699
11:00 10009.7 10397.01 10486.91 10285.42 10179.07 23:00 9370.54 9626.381 9834.738 9837.936 9548.438
11:30 9912.55 10269.94 10325.88 10207.75 10053.06 23:30 9214.94 9445.299 9640.604 9664.001 9367.85

methods on Wednesday. From Table 5, it can be seen that each method has the best accuracy on some specific days. When making a comparison between the three separate methods and the combined method, the results were as follows:

Combined method vs. BP: the three parameters RMSE, MAE and MAPE of the combined method had lower values than BP except on Saturday. On Saturday the values of the combined method were slightly higher than those of BP, but on the other days they were lower. In general, taking the average value into account, the combined method performed better than BP. The combined method reduced RMSE by 64.8%, MAE by 69.8% and MAPE by 68.9%.

Combined method vs. diff-SARIMA: the three parameters RMSE, MAE and MAPE of the combined method had lower values than diff-SARIMA on every day. That means diff-SARIMA was not as good as the combined method. The combined method reduced RMSE by 53.3%, MAE by 59.0% and MAPE by 59.3%.

Combined method vs. ANFIS: the three parameters RMSE, MAE and MAPE of the combined method had lower values than ANFIS except on Monday and, for MAE and MAPE, on Friday. On Monday, all three metrics of ANFIS were lower than those of the combined method. In terms of the average values, ANFIS was not as good as the combined method, which reduced RMSE by 38.8%, MAE by 41.7% and MAPE by 42.5%.

All in all, BP performed better on Saturday than the combined method and ANFIS was better on Monday and Friday than the combined method. In general, taking into account the average values of the whole week, the combined method had lower values than the three individual methods. That is to say, the combined method outperformed the other methods.

In order to contrast the four methods in detail, the 48 observations on Wednesday were taken as an illustration. From Fig. 9, it can be seen that BP was as effective as the combined method before 9:00, with a high accuracy, while after 9:00 the differences between the actual data and the forecasted data became greater. Although ANFIS performed better between 15:00 and 18:00, it could not match the combined method over the whole day. On Wednesday, the forecasting results of diff-SARIMA were always higher than the actual data, which indicates that diff-SARIMA cannot predict well on that day. For the combined method, the forecasted and actual curves on Wednesday almost coincided. This demonstrated that the combined method was superior to the three individual ones.

Fig. 9. Forecasted values by four methods on Wednesday.

Furthermore, to display the accuracy of every separate method vividly, Fig. 10 shows the errors, that is, the differences between the forecasted values and the actual values, of all four methods for every day. As shown in Fig. 10, the combined method had the smallest errors, which were close to zero for the whole week. Meanwhile, the other three single methods had small errors only on some days and bigger errors on most days. Taking Wednesday as an example: the errors of the combined method were almost zero and the combined method curve was nearly a flat line, which means the forecasting values of the combined method were close to the actual values. The errors of diff-SARIMA were also almost a line, but all the errors were above zero. The curve of ANFIS obviously fluctuated up and down. All in all, the combined method had the smallest errors among all the methods. That demonstrates that the combined method was superior to the three individual methods.

Fig. 10. Errors for each day by four methods.

Because forecasting errors will increase the burden on electrical production costs, increasing the accuracy of electricity demand forecasting will considerably decrease electricity consumption. The combined method takes advantage of each single method, which makes the combined method superior to the separate methods. As a result, the combined method possesses high accuracy and stability. Meanwhile, through the DE optimization algorithm, the weights of the proposed method can be optimized. The MAPE of the combined method was 1.654%, much lower than that of BP, ANFIS and diff-SARIMA. In a nutshell, the aforementioned comparison results confirm that the combined method outperformed the three individual methods.
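The percentage reductions quoted in this section can be checked against the whole-week row of Table 5. The sketch below assumes that each reduction is the relative decrease (baseline − combined) / baseline of the corresponding metric; the names `reduction`, `WEEK` and `COMBINED` are illustrative.

```python
def reduction(baseline, combined):
    """Relative reduction (%) of an error metric when moving from baseline to combined."""
    return 100.0 * (baseline - combined) / baseline

# Whole-week (RMSE, MAE, MAPE) values copied from Table 5
WEEK = {
    "BP": (565.515, 528.356, 5.315),
    "diff-SARIMA": (426.812, 389.290, 4.061),
    "ANFIS": (325.620, 273.027, 2.876),
}
COMBINED = (199.311, 159.533, 1.654)

def all_reductions():
    """Reductions of RMSE, MAE and MAPE against each individual method, to one decimal."""
    return {
        name: tuple(round(reduction(b, c), 1) for b, c in zip(vals, COMBINED))
        for name, vals in WEEK.items()
    }

for name, vals in all_reductions().items():
    print(name, vals)
```

Under this assumption the computed values agree with the percentages quoted above to one decimal place, except that the MAE reduction against ANFIS comes out at about 41.6% rather than 41.7%.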

6. Conclusions

In recent years, electricity demand forecasting techniques with high accuracy and effectiveness have attracted more and more attention from both experts and companies. Effective forecasting methods can save a lot of energy and money and reduce multiple risks. However, not many methods can really be applied in actual forecasting. Moreover, the prediction accuracy is still far from perfect and the existing single methods cannot forecast all kinds of data well. The single methods can only deal with either linear data or nonlinear data and cannot deal with both. Some combined methods improve the accuracy only a little. Therefore, developing more accurate electricity demand forecasting methods is highly desirable.

In this paper, a novel combined method that consists of three single methods (BP, ANFIS and diff-SARIMA) was proposed. The combined method can reduce the drawbacks of the single methods and incorporate their merits, so it is no surprise that the combined method was superior to the conventional individual methods, because BP and ANFIS can deal with nonlinear data while diff-SARIMA can deal with linear data. The combined method selects these methods so that it can cope with linear, nonlinear and seasonal data. Furthermore, because of the combination of the three individual methods, the combined method can effectively forecast both small sample problems and big sample problems. Thus the combined method has extensive applicability. Moreover, when compared with the existing combined method named MFES that was proposed by Chen et al. [55], the MAPE of the combined method proposed in this paper (1.654) was lower than that of ESPLSSVM (1.79) over the whole week. The proposed combined method reduced RMSE by 1.83%, MAE by 5.49% and MAPE by 7.6% when contrasted with ESPLSSVM. That means this method is preferable to ESPLSSVM, since the forecasted values were closer to the actual values. It can automatically deal with different cases and does not need complicated decisions about the model. Another advantage of this method is that the weight coefficients are allocated to each separate method by an optimization algorithm named DE. It can balance the single methods and choose the best weight coefficients for the combination. So, the results indicate that the proposed method is better than most of the single methods and some of the existing methods for electricity demand forecasting and has higher accuracy.

Acknowledgments

The authors would like to thank the Natural Science Foundation of PR of China (61073193, 61300230), the Key Science and Technology Foundation of Gansu Province (1102FKDA010), the Natural Science Foundation of Gansu Province (1107RJZA188), and the Science and Technology Support Program of Gansu Province (1104GKCA037) for supporting this research.

References

[1] D. Bunn, Comparative Models for Electrical Load Forecasting, 1985.
[2] L. Xiao, J. Wang, R. Hou, et al., A combined model based on data pre-analysis and weight coefficients optimization for electrical load forecasting, Energy 82 (2015) 524–549.
[3] A. Goia, C. May, G. Fusai, Functional clustering and linear regression for peak load forecasting, Int. J. Forecast. 26 (4) (2010) 700–711.
[4] V. Bianco, O. Manca, S. Nardini, Linear regression models to forecast electricity consumption in Italy, Energy Sources B 8 (1) (2013) 86–93.
[5] V. Bianco, O. Manca, S. Nardini, Electricity consumption forecasting in Italy using linear regression models, Energy 34 (9) (2009) 1413–1421.
[6] R. Dong, W. Pedrycz, A granular time series approach to long-term forecasting and trend forecasting, Physica A 387 (13) (2008) 3253–3270.
[7] E. Erdogdu, Electricity demand analysis using cointegration and ARIMA modelling: a case study of Turkey, Energy Policy 35 (2) (2007) 1129–1146.
[8] Y. Wang, J. Wang, G. Zhao, et al., Application of residual modification approach in seasonal ARIMA for electricity demand forecasting: a case study of China, Energy Policy 48 (2012) 284–294.
[9] A. Khosravi, S. Nahavandi, D. Creighton, et al., Interval type-2 fuzzy logic systems for load forecasting: a comparative study, IEEE Trans. Power Syst. 27 (3) (2012) 1274–1282.
[10] S. Kucukali, K. Baris, Turkey’s short-term gross annual electricity demand forecast by fuzzy logic approach, Energy Policy 38 (5) (2010) 2438–2445.
[11] P.K. Dash, A.C. Liew, S. Rahman, et al., Building a fuzzy expert system for electric load forecasting using a hybrid neural network, Expert Syst. Appl. 9 (3) (1995) 407–421.
[12] Z. Yongli, B.W. Hogg, W.Q. Zhang, et al., Hybrid expert system for aiding dispatchers on bulk power systems restoration, Int. J. Electr. Power Energy Syst. 16 (4) (1994) 259–268.
[13] P. Zhou, B.W. Ang, K.L. Poh, A trigonometric grey prediction approach to forecasting electricity demand, Energy 31 (14) (2006) 2839–2847.
[14] D. Akay, M. Atak, Grey prediction with rolling mechanism for electricity demand forecasting of Turkey, Energy 32 (9) (2007) 1670–1675.
[15] K. Kandananond, Forecasting electricity demand in Thailand with an artificial neural network approach, Energies 4 (8) (2011) 1246–1257.
[16] A.P. Plumb, R.C. Rowe, P. York, et al., Optimisation of the predictive ability of artificial neural network (ANN) models: a comparison of three ANN programs and four classes of training algorithm, Eur. J. Pharm. Sci. 25 (4) (2005) 395–405.
[17] S.M. Al-Alawi, S.A. Abdul-Wahab, C.S. Bakheit, Combining principal component regression and artificial neural networks for more accurate predictions of ground-level ozone, Environ. Model. Softw. 23 (4) (2008) 396–403.
[18] A. Azadeh, M. Saberi, A. Gitiforouz, et al., A hybrid simulation-adaptive network based fuzzy inference system for improvement of electricity consumption estimation, Expert Syst. Appl. 36 (8) (2009) 11108–11117.
[19] A. Khajeh, H. Modarress, B. Rezaee, Application of adaptive neuro-fuzzy inference system for solubility prediction of carbon dioxide in polymers, Expert Syst. Appl. 36 (3) (2009) 5728–5732.
[20] T.Y. Pai, T.J. Wan, S.T. Hsu, et al., Using fuzzy inference system to improve neural network for predicting hospital wastewater treatment plant effluent, Comput. Chem. Eng. 33 (7) (2009) 1272–1278.
[21] Z. Xiao, S.J. Ye, B. Zhong, et al., BP neural network with rough set for short term load forecasting, Expert Syst. Appl. 36 (1) (2009) 273–279.
[22] C.S.N. Azwadi, M. Zeinali, A. Safdari, et al., Adaptive-network-based fuzzy inference system analysis to predict the temperature and flow fields in a lid-driven cavity, Numer. Heat Transf. A 63 (12) (2013) 906–920.
[23] John M. Bates, Clive W.J. Granger, The combination of forecasts, J. Oper. Res. Soc. 20 (4) (1969) 451–468.
[24] T. Xiong, Y. Bao, Z. Hu, Multiple-output support vector regression with a firefly algorithm for interval-valued stock price index forecasting, Knowledge Based Syst. 55 (2014) 87–100.
[25] S. Yu, K. Zhu, S. Gao, A hybrid MPSO-BP structure adaptive algorithm for RBFNs, Neural Comput. Appl. 18 (7) (2009) 769–779.
[26] Z. Tan, J. Zhang, J. Wang, et al., Day-ahead electricity price forecasting using wavelet transform combined with ARIMA and GARCH models, Appl. Energy 87 (11) (2010) 3606–3610.
[27] K. Xie, H. Zhang, C. Singh, Reliability forecasting models for electrical distribution systems considering component failures and planned outages, Int. J. Electr. Power Energy Syst. 79 (2016) 228–234.
[28] A. Khashman, A modified backpropagation learning algorithm with added emotional coefficients, IEEE Trans. Neural Netw. 19 (11) (2008) 1896–1909.
[29] M. Blanchard, G. Desrochers, Generation of autocorrelated wind speeds for wind energy conversion system studies, Sol. Energy 33 (6) (1984) 571–579.
[30] M.P. Clements, D.F. Hendry, Forecasting economic processes, Int. J. Forecast. 14 (1) (1998) 111–131.
[31] Y.H. Liang, Combining seasonal time series ARIMA method and neural networks with genetic algorithms for predicting the production value of the mechanical industry in Taiwan, Neural Comput. Appl. 18 (7) (2009) 833–841.
[32] T. Koutroumanidis, K. Ioannou, G. Arabatzis, Predicting fuelwood prices in Greece with the use of ARIMA models, artificial neural networks and a hybrid ARIMA-ANN model, Energy Policy 37 (9) (2009) 3627–3634.
[33] R.G. Kavasseri, K. Seetharaman, Day-ahead wind speed forecasting using f-ARIMA models, Renew. Energy 34 (5) (2009) 1388–1393.
[34] D. Petković, M. Issa, N.D. Pavlović, et al., Adaptive neuro-fuzzy estimation of conductive silicone rubber mechanical properties, Expert Syst. Appl. 39 (10) (2012) 9477–9482.
[35] A. El-Shafie, O. Jaafer, S.A. Akrami, Adaptive neuro-fuzzy inference system based model for rainfall forecasting in Klang River: Malaysia, Int. J. Phys. Sci. 6 (12) (2011) 2875–2888.
[36] A. Khajeh, H. Modarress, B. Rezaee, Application of adaptive neuro-fuzzy inference system for solubility prediction of carbon dioxide in polymers, Expert Syst. Appl. 36 (3) (2009) 5728–5732.
[37] L. Ke, G. Wenyan, S. Xiaoliu, et al., Research on the forecast model of electricity power industry loan based on GA-BP neural network, Energy Procedia 14 (2012) 1918–1924.
[38] J. Wang, S. Zhu, W. Zhao, et al., Optimal parameters estimation and input subset for grey model based on chaotic particle swarm optimization algorithm, Expert Syst. Appl. 38 (7) (2011) 8151–8158.
[39] B. Chen, L. Chen, Y. Chen, Efficient ant colony optimization for image feature selection, Signal Process. 93 (6) (2013) 1566–1576.
[40] A. Azadeh, S.F. Ghaderi, S. Sohrabkhani, Annual electricity consumption forecasting by neural network in high energy consuming industrial sectors, Energy Convers. Manage. 49 (8) (2008) 2272–2278.
[41] S. Haykin, N. Network, A comprehensive foundation, Neural Netw. 2004 (2004) 2.
[42] D.Ö. Faruk, A hybrid neural network and ARIMA model for water quality time series prediction, Eng. Appl. Artif. Intell. 23 (4) (2010) 586–594.
[43] J.S.R. Jang, Self-learning fuzzy controllers based on temporal backpropagation, IEEE Trans. Neural Netw. 3 (5) (1992) 714–723.
[44] M. Liu, Y.Y. Ling, Using fuzzy neural network approach to estimate contractors’ markup, Build. Environ. 38 (11) (2003) 1303–1308.
[45] G.E.P. Box, G.M. Jenkins, G.C. Reinsel, et al., Time Series Analysis: Forecasting and Control, John Wiley & Sons, 2015.
[46] F.M. Tseng, H.C. Yu, G.H. Tzeng, Combining neural network model with seasonal time series ARIMA model, Technol. Forecast. Soc. Change 69 (1) (2002) 71–87.
[47] R. Storn, K. Price, Differential Evolution-a Simple and Efficient Adaptive Scheme for Global Optimization over Continuous Spaces, ICSI, Berkeley, 1995.
[48] D.Y. Sha, C.Y. Hsu, A new particle swarm optimization for the open shop scheduling problem, Comput. Oper. Res. 35 (10) (2008) 3243–3261.
[49] A.A.A. El Ela, M.A. Abido, S.R. Spea, Optimal power flow using differential evolution algorithm, Electr. Eng. 91 (2) (2009) 69–78.
[50] B.V. Babu, S.A. Munawar, Differential evolution strategies for optimal design of shell-and-tube heat exchangers, Chem. Eng. Sci. 62 (14) (2007) 3720–3739.
[51] M.H. Khademi, M.R. Rahimpour, A. Jahanmiri, Differential evolution (DE) strategy for optimization of hydrogen production, cyclohexane dehydrogenation and methanol synthesis in a hydrogen-permselective membrane thermally coupled reactor, Int. J. Hydrogen Energy 35 (5) (2010) 1936–1950.
[52] J.S. Armstrong, Combining forecasts: the end of the beginning or the beginning of the end? Int. J. Forecast. 5 (4) (1989) 585–588.
[53] C. Christodoulos, C. Michalakelis, D. Varoutas, Forecasting with limited data: combining ARIMA and diffusion models, Technol. Forecast. Soc. Change 77 (4) (2010) 558–565.
[54] S. Gupta, P.C. Wilton, Combination of forecasts: an extension, Manage. Sci. 33 (3) (1987) 356–372.
[55] Y. Chen, Y. Yang, C. Liu, et al., A hybrid application algorithm based on the support vector machine and artificial intelligence: an example of electric load forecasting, Appl. Math. Model. 39 (9) (2015) 2617–2632.
