1 s2.0 S1568494616303891 Main
Article info

Article history:
Received 15 January 2016
Received in revised form 7 June 2016
Accepted 31 July 2016
Available online 3 August 2016

Keywords:
Electricity demand forecasting
Combined forecasting method
BP
ANFIS
diff-SARIMA
DE

Abstract

Electricity demand forecasting is a vital tool in the electricity market and plays a critical role for power utilities: accurate forecasts reduce production costs and save energy resources, which makes forecasting techniques an indispensable part of the energy system. This paper presents a novel combined forecasting method based on the Back Propagation (BP) neural network, the Adaptive Network-based Fuzzy Inference System (ANFIS) and the Difference Seasonal Autoregressive Integrated Moving Average (diff-SARIMA). First, each of the three methods (BP, ANFIS, diff-SARIMA) produces its own forecast. The final forecast is then obtained by multiplying the three forecasts by their optimal weight coefficients and adding them up. Among the three individual methods, BP and ANFIS can deal with nonlinear data, while diff-SARIMA can deal with linear and seasonal data. The combined method therefore incorporates the merits of the individual methods while offsetting their drawbacks, and it can deal with linear, nonlinear and seasonal data. The Differential Evolution (DE) optimization algorithm is used to optimize the weight coefficients. The combined method is verified by comparing it with the three individual methods. In case studies using half-hourly electricity power data from the State of New South Wales, Australia, the combined method performed better than all three individual methods and reduced the errors between the actual and forecasted values.

© 2016 Published by Elsevier B.V.
Contents
1. Introduction
2. Methodologies
2.1. BP neural network
2.2. Adaptive network based fuzzy inference system
2.3. Seasonal autoregressive integrated moving average
2.4. Differential evolution
3. Combined forecasting method
3.1. Theory of the combined forecasting method
3.2. Determining the weights in the combined method
4. Forecasting statistical metrics
5. Analysis
∗ Corresponding author.
E-mail address: [email protected] (Y. Chen).
http://dx.doi.org/10.1016/j.asoc.2016.07.053
1568-4946/© 2016 Published by Elsevier B.V.
664 Y. Yang et al. / Applied Soft Computing 49 (2016) 663–675
1. Introduction

As living standards and socio-economic conditions have improved significantly, energy consumption continues to grow. Energy shortages are becoming increasingly serious, and more and more countries attach great importance to the issue. Electricity, one of the most important energy resources and a major driver of social development, plays a crucial role in the energy system. Electricity is difficult to store, and the electric system is affected by various unstable factors such as weather, population, holidays and emergencies. These problems make it difficult for the electricity production industry to estimate its output, so an accurate and precise forecasting method is needed in the electricity market. Inappropriate electricity demand forecasting, by contrast, is counterproductive. An overestimate increases the workload of electricity production and wastes energy resources. Bunn and Farmer pointed out that a 1% increase in forecasting error raises operating costs by 10 million dollars [1]. An underestimate can paralyze the electricity grid and leave some regions facing power failures. A good forecasting method is therefore the key to responding to future electricity demand, and precise electricity demand forecasting is a prerequisite for meeting that demand in developed and developing countries alike. Whatever the forecast horizon (long-term, midterm or other), a novel method that is both effective and more accurate is a must [2].

More and more novel methods have been proposed in recent years because of the need for better forecasting performance, for example regression-based methods and time-series-based methods. Among the regression methods, linear regression is the most widely used. Goia et al. [3] applied a linear regression method to forecast the peak load. Bianco et al. [4,5] used linear regression models to forecast electricity consumption in Italy. ARIMA is the most commonly used time series model. For instance, Dong et al. [6] employed the time series approach to forecast long-term load, Erdogdu et al. [7] analyzed electricity demand in Turkey using ARIMA, and Wang et al. [8] proposed seasonal ARIMA to forecast electricity demand in China. Although these regression-based and time-series-based methods are mature, the electricity demand time series is clearly nonlinear, and they cannot properly forecast nonlinear load series. Moreover, these methods depend to a great extent on the reliability and availability of external factors.

Over the last several decades, Artificial Intelligence (AI) methods have demonstrated a formidable ability to deal with seasonal and nonlinear load data. New methodologies and techniques have emerged, such as the fuzzy logic system [9,10], the expert system [11,12], the grey prediction models [13,14], the Artificial Neural Networks (ANN) [15–17] and the fuzzy inference system [18–20]. Among these methods, artificial neural networks and fuzzy theories are well received because they can process nonlinear data. Xiao et al. [21] developed a BP neural network with a rough set to forecast short-term time series data, and the results demonstrated the superiority of BP. Azwadi et al. [22] used ANFIS to predict the temperature and flow fields in a lid-driven cavity.

Despite the introduction of artificial intelligence, no individual method delivers the desired results in every case, because each has its own disadvantages. For instance, neural networks tend to converge to local optima rather than the global optimum. Expert systems rely excessively on knowledge and cannot always reach optimal results, while grey prediction systems are suited to exponential growth models. Hybrid and combined methods, which take full advantage of the merits of each method, have therefore developed rapidly. The idea of the combined method was first introduced by Bates et al. [23], who showed that combined methods were more efficient and easier than single ones. In addition, hybrid and combined methods aggregate the advantages of multiple separate methods, and for this reason they are widely used in different applications. Xiong et al. [24] proposed a hybrid method based on support vector regression with a firefly algorithm to forecast a stock price index. Yu et al. [25] used a novel hybrid structure, the MPSO-BP adaptive algorithm, with the Radial Basis Function Network (RBFN). Tan et al. [26] developed a combined method using three individual methods, namely wavelet transform, ARIMA and Generalized Autoregressive Conditional Heteroscedasticity (GARCH), to forecast the electricity price.

Although hybrid and combined methods both have advantages, they focus on different aspects. Hybrid methods use a series of methods to preprocess the data, for example noise reduction, seasonal adjustment and clustering, whereas combined methods assign weight coefficients to the individual methods that they comprise. Drawing on the merits of both kinds of methods, this paper proposes a novel combined method that can process both linear and nonlinear time series data. In the combined method, diff-SARIMA deals with the linear data, while BP and ANFIS deal with the nonlinear data. The DE optimization algorithm is used to optimize the weight coefficients of the combined method. Each of these three methods has been used separately in different applications.

The BP neural network is a multilayer feedforward network, based on input-output mapping, that is trained by the error back propagation algorithm. Steepest gradient descent or another method adjusts all the weights and thresholds of the network so that, after several iterations, the minimum sum of squared errors is obtained. BP, one of the most traditional models, has been adopted in many applications. Reliability forecasting models for electrical distribution systems considering component failures and planned outages were built on BP
by Xie et al. [27]. Khashman [28] proposed a modified BP learning algorithm with added emotional coefficients.

The Autoregressive Integrated Moving Average (ARIMA) method is regarded as a traditional time series forecasting method [29,30]. Its basic idea is that the forecasted object, observed as time goes by, is treated as a random series. Once a mathematical model that approximately describes the sequence has been identified, forecasts can be obtained from it. The ARIMA method has been widely employed in different areas, such as production value prediction [31], fuel wood price forecasting [32] and wind speed forecasting [33]. Besides using SARIMA, this paper also introduces differencing: the original data are differenced before being put into the SARIMA method. Because the original data show a significant linear trend, differencing levels the data off. In experiments, diff-SARIMA outperformed the single SARIMA.

ANFIS, which brings learning principles into fuzzy systems, was first presented by Roger Jang, creating a self-adaptive system that can imitate human recognition. Once embedded into a fuzzy structure, a neural network can train and generate data autonomously. After tuning, the best membership functions and fuzzy rules can be obtained. ANFIS is commonly used in applications with nonlinear data. Petković et al. [34] employed ANFIS to estimate silicone rubber mechanical properties. El-Shafie et al. [35] proposed a model based on ANFIS to forecast rainfall. ANFIS was also used by Khajeh et al. [36] to predict the solubility of carbon dioxide.

Although these three methods are combined into a novel method, a problem remains: the weight coefficients of the combined method must be adjusted to obtain the best results. To address this problem and improve accuracy, parameter optimization algorithms have been applied in many applications. Ke et al. [37] applied BP optimized by GA (Genetic Algorithm) to forecast the industry loan of electricity power. Wang et al. [38] proposed a grey model whose parameters were optimized by PSO (Particle Swarm Optimization). Chen et al. [39] used an efficient Ant Colony Optimization (ACO) to optimize image features.

In this paper, DE was chosen as the evolutionary algorithm. Like other evolutionary algorithms such as GA, PSO and ACO, it belongs to the family of stochastic parallel algorithms based on swarm intelligence. DE can store knowledge in memory in order to track the current search dynamically, so that optimal outcomes can be obtained by adjusting the parameters. Compared with GA, PSO and ACO, DE has strong global convergence and robustness. GA and PSO, moreover, have many parameters that greatly influence the results, which makes them harder to apply in practice, especially for high-dimensional optimization problems. DE is therefore the better algorithm for adjusting the parameters.

By combining BP, ANFIS and diff-SARIMA and adjusting the weight coefficients with DE, the combined method is expected to recognize most seasonal, linear and nonlinear patterns.

The detailed procedure of the proposed method is listed below:

(1) Input the original data into the BP neural network and, after several rounds of training and testing, obtain the forecasted data.
(2) Difference the time series by subtracting each value from the one that follows it, and put the new time series into the SARIMA method to obtain the forecasted data.
(3) Put the original series into ANFIS and obtain the results.
(4) Apply DE to adjust the weight coefficients of the three methods. If the results are within the accuracy limit, the iteration stops; otherwise, DE continues to adjust the weights.
(5) Compare the three individual methods mentioned above with the proposed combined method.

The rest of the paper is organized as follows. The three individual methods (BP, ANFIS and diff-SARIMA) and the DE optimization algorithm are described in Section 2. Section 3 introduces the combined method. Section 4 gives the statistical measures of forecasting performance. Section 5 presents the comparison between the combined method and the individual methods. Finally, Section 6 summarizes the whole paper.

2. Methodologies

Many methods exist for forecasting electricity production and consumption, such as time series methods, classical statistical methods and artificial intelligence methods. This paper takes advantage of the merits of different methods and proposes a novel combined method consisting of three methods: the BP neural network, ANFIS and diff-SARIMA.

2.1. BP neural network

The artificial neural network is self-organizing, adaptive and self-learning, as well as nonlinear, non-local, non-stationary and non-convex, which makes it a powerful tool for solving complex problems [40]. As a kind of ANN, BP is extensively used in many cases.

Fig. 1. The structure of BP. (Input layer x1, x2, ..., xn; hidden layer; output layer z1, ..., zt; connection weights wji and wil.)

This research focuses on a three-layer BP neural network, whose framework is presented in Fig. 1. Each neuron in the network is a node, and the network is composed of an input layer, a hidden layer and an output layer. Adjacent layers are connected through weight coefficients. It is noteworthy that there can be one hidden layer or several. The BP process consists of two phases: forward propagation and error back propagation. When the BP neural network starts learning, the input signal passes from the input layer to the hidden layer and, after processing, reaches the output layer. If the expected output is obtained during forward propagation, learning terminates. Otherwise, back propagation begins. Back propagation is a procedure in which the error signal (the difference between the expected output and the network output) is propagated backward along the original connection path [41]. To decrease the error signal, the weights of each neuron are adjusted by the gradient descent method. The
training process is described by the following two steps, which update these weights [42].

(1) Hidden layer stage: The outputs of all neurons in the hidden layer are calculated by the following equations:

$$ net_i = \sum_{j=0}^{m} w_{ji} x_j, \quad i = 1, 2, \ldots, n \tag{1} $$

$$ y_i = f_H(net_i), \quad i = 1, 2, \ldots, n \tag{2} $$

where $net_i$ is the activation value of the ith node, $y_i$ is the output of the hidden layer, and $f_H$ is the activation function of a node, usually a sigmoid function:

$$ f_H(x) = \frac{1}{1 + \exp(-x)} \tag{3} $$

(2) Output stage: The outputs of all neurons in the output layer are given as follows:

$$ z_l = f_l\left( \sum_{i=0}^{n} w_{il} y_i \right), \quad l = 1, 2, \ldots, t \tag{4} $$

where $f_l$ ($l = 1, 2, \ldots, t$) is the activation function, which is usually defined as a linear function. All weights are initially assigned random values and are then modified by the delta rule according to the learning samples.

2.2. Adaptive network based fuzzy inference system

The Adaptive Network Based Fuzzy Inference System (ANFIS), which integrates the merits of fuzzy systems and neural networks, was proposed by Jang [43]. Using the neural network learning mechanism, it automatically extracts rules from input and output sample data, and thus constitutes a self-adaptive neural fuzzy controller. Through off-line training and an online learning algorithm, it performs fuzzy reasoning and self-tunes the rules, which makes the system self-adaptive, self-organizing and self-learning [44]. The ANFIS architecture is shown in Fig. 2. From Fig. 2, it can be seen that the ANFIS network comprises five layers. Each layer contains several nodes described by the node function.

Fig. 2. The structure of ANFIS.

Layer 1
The first layer of this architecture is the fuzzy layer. Each node of this layer produces the membership grade of a fuzzy set. Every node in this layer is an adaptive node with a node function:

$$ O_i^1 = \mu_{A_i}(x) \tag{5} $$

Layer 3
Each node N computes the ratio of the ith rule's firing weight to the sum of all rules' firing weights:

$$ O_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{7} $$

The outputs of this layer are called normalized firing strengths.

Layer 4
Compute the contribution of the ith rule to the overall output:

$$ O_i^4 = \bar{w}_i f_i = \bar{w}_i (a_i x + b_i y + c_i), \quad i = 1, 2 \tag{8} $$

where $\bar{w}_i$ is the output of layer 3, and $\{a_i, b_i, c_i\}$ is the parameter set.

Layer 5
The single node computes the final output as the summation of all incoming signals:

$$ O_i^5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{9} $$

The final output of the adaptive neuro-fuzzy inference system is expressed as:

$$ f_{out} = \bar{w}_1 f_1 + \bar{w}_2 f_2 = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2 = (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + (\bar{w}_1) r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + (\bar{w}_2) r_2 \tag{10} $$

where $p_i$, $q_i$ and $r_i$ are the consequent parameters written as $a_i$, $b_i$ and $c_i$ in Eq. (8).

2.3. Seasonal autoregressive integrated moving average

The SARIMA method, one of the most popular time series forecasting methods, was introduced by Box and Jenkins [45]. Generally speaking, it is assumed that the time series $\{Z_t \mid t = 1, 2, \ldots, k\}$ has mean zero. A non-seasonal ARIMA model of order $(p, d, q)$ (denoted by ARIMA$(p, d, q)$) representing the time series can be expressed as:

$$ Z_t = \varphi_1 Z_{t-1} + \varphi_2 Z_{t-2} + \ldots + \varphi_p Z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \ldots - \theta_q a_{t-q} \tag{11} $$

or

$$ \varphi(B) \nabla^d Z_t = \theta(B) a_t \tag{12} $$

where $Z_t$ and $a_t$ are the actual value and the random error at time $t$, respectively; $\varphi_i$ and $\theta_i$ are the coefficients; $p$ is the order of the autoregressive polynomial; $q$ is the order of the moving average polynomial; $B$ denotes the backward shift operator; $\nabla^d = (1 - B)^d$; $d$ is the order of regular differences; and $\varphi(B)$ and $\theta(B)$ are defined as $\varphi(B) = 1 - \varphi_1 B - \varphi_2 B^2 - \ldots - \varphi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \ldots - \theta_q B^q$, respectively. In particular, it is assumed that the $a_t$ are independent and identically distributed normal random variables with mean zero and variance $\sigma^2$, and that the roots of $\varphi(Z) = 0$ and $\theta(Z) = 0$ all lie outside the unit circle [46]. In the same way, a seasonal model (using the second expression) can be written as:

$$ \varphi(B) \Phi(B^S) \nabla^d (1 - B^S) Z_t = \theta(B) \Theta(B^S) a_t \tag{13} $$
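The differencing operators in Eq. (13), together with the differencing step of diff-SARIMA (subtracting each value from the one that follows it), can be illustrated with a minimal sketch. The helper names `difference` and `undifference`, the toy series and the seasonal period of 4 are assumptions made for illustration only; they are not taken from the paper.

```python
def difference(z, lag=1):
    """Apply (1 - B^lag): return z[t] - z[t - lag] for every t >= lag."""
    return [z[t] - z[t - lag] for t in range(lag, len(z))]


def undifference(dz, initial, lag=1):
    """Invert `difference`, given the first `lag` values of the original series."""
    out = list(initial)
    for i, d in enumerate(dz):
        # out[i] is the value `lag` steps before the one being rebuilt.
        out.append(out[i] + d)
    return out


# Toy series (hypothetical): linear trend plus a pattern repeating every 4 steps.
season = [0, 1, -1, 0]
z = [10 + t + season[t % 4] for t in range(12)]

dz = difference(z)            # regular difference (1 - B) removes the linear trend
w = difference(dz, lag=4)     # seasonal difference (1 - B^4) removes the seasonal pattern
restored = undifference(dz, z[:1])  # map differenced values back to the original scale

print(dz)
print(w)
print(restored == z)
```

A forecast produced on the differenced scale has to be mapped back with the inverse operation before it can be compared with the actual demand, which is what `undifference` stands in for here.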
Fig. 3. Flowchart of the combined method.

MAPE is the criterion used to assess the accuracy of the result. These metrics can be calculated as follows:

$$ RMSE = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2 } \tag{22} $$

$$ MAE = \frac{1}{N} \sum_{i=1}^{N} |x_i - \hat{x}_i| \tag{23} $$

$$ MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{x_i - \hat{x}_i}{x_i} \right| \times 100\% \tag{24} $$

where $x_i$ represents the actual value and $\hat{x}_i$ represents the forecasted value. The smaller these metrics are, the more accurate the method is.

means each day had 48 observations, and the 336 values for the whole week were the observations to be forecast.

To simplify forecasting and take advantage of the similar properties of the same weekday, the data pre-analysis divides the initial data by day of the week into seven groups. All the Mondays were put into the first subset, all the Tuesdays into the second subset, and so on. The selection of the training and testing datasets is shown in Fig. 4. From Fig. 4, it can be seen that days 1 to k − 1 of the first subset are used for model fitting and training, and days 2 to k of the same subset are forecast using the constructed model. After several rounds of training, the model became stable enough to adjust to different factors and cases. For example, the electrical power data of all Mondays from the first Monday, 2 May 2011, to the last Monday before 3 July 2011 were put into the first subset. Using the earlier Mondays, the next Monday could be forecast. After several experiments and simulations, the final results were obtained. In this paper, 8 whole weeks of electricity demand data were used to forecast the same day of the following week.

The following sections analyze the four methods. Because all the methods are simulated with the same dataset, the results can be compared. Sections 5.2, 5.3 and 5.4 display the forecasting results of BP, ANFIS and diff-SARIMA respectively. Section 5.5 states the weight coefficients of the three individual methods. Finally, Section 5.6 displays the comparisons between the combined method and the three individual methods.
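The weekday grouping and the "days 1 to k − 1 for training, days 2 to k as targets" split described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `group_by_weekday` and `training_pairs` are hypothetical, the series is assumed to start on a Monday, and the demand values are synthetic rather than the New South Wales data.

```python
OBS_PER_DAY = 48  # half-hourly observations per day, as stated in the text


def group_by_weekday(series, obs_per_day=OBS_PER_DAY):
    """Split a series that starts on a Monday into seven weekday subsets.

    subsets[0] holds every Monday, subsets[1] every Tuesday, and so on;
    each day is a list of `obs_per_day` values.
    """
    if len(series) % obs_per_day:
        raise ValueError("series must contain whole days")
    days = [series[i:i + obs_per_day] for i in range(0, len(series), obs_per_day)]
    return [days[wd::7] for wd in range(7)]


def training_pairs(subset):
    """Pair each day with the same weekday one week later (days 1..k-1 -> days 2..k)."""
    return list(zip(subset[:-1], subset[1:]))


# Synthetic stand-in data: 9 weeks of half-hourly values starting on a Monday.
series = [9000.0 + 10 * (t % OBS_PER_DAY) + t // OBS_PER_DAY
          for t in range(9 * 7 * OBS_PER_DAY)]

subsets = group_by_weekday(series)
mondays = subsets[0]
pairs = training_pairs(mondays)
print(len(subsets), len(mondays), len(pairs))
```

With nine Mondays in the subset, the split yields eight input/target pairs, which mirrors the use of eight weeks of history to forecast the same weekday of the following week.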
Table 1
Three statistical metrics of BP in a week.

Table 2
Three statistical metrics of ANFIS in a week.

5.3. The forecasting results of ANFIS

Fig. 6 Part A presents the three typical results of ANFIS. The bar graphs show that ANFIS had comparatively high values of MAPE on Monday, Wednesday, Saturday and Sunday, with the highest value almost 4%, whereas the remaining days had comparatively low values, under 2.1%. Taken as a whole, Fig. 6 Part A shows that the results of ANFIS forecasting are average, neither too high nor too low. To display the forecasting details of ANFIS, the electricity demand values of the 48 observations on Wednesday are presented in Fig. 6 Part B. Between 0:00–4:00 and 16:00–18:00, the results obtained from ANFIS were close to the actual values, while at other times the errors were larger. Table 2 shows the results that measure the ANFIS method. All the MAPE values were between 1% and 4%, and the best result was 1.999%.

5.4. The forecasting results of diff-SARIMA

From Fig. 7 Part A, the value of MAPE on Tuesday was the lowest, and the other days were almost the same. The curves in Fig. 7 Part B look smooth, and there is a slight difference between the actual data and the forecasted data. It can be seen that the forecasted values were always higher than the actual values on Wednesday. The three typical results of diff-SARIMA are displayed in Table 3; the MAPE was 1.545% on Tuesday and over 3% on the other days.
Table 3
Three statistical metrics of diff-SARIMA in a week.

5.5. The weight coefficients of the combined method

The proposed method is a novel method that combines three different methods. The weight coefficients of the three methods for each day of the week are presented in Table 4. Table 4 shows that the weight coefficients differ from day to day. The weight coefficient reflects the degree of influence: the higher the weight coefficient, the greater the influence of that method. For example, on Saturday the weight coefficient of ANFIS was larger than those of the other two methods, which means that ANFIS had the greatest effect on the combined method on that day. As Table 4 shows, the combined method took advantage of all three individual methods, and each method had a different effect on the combined method on each day. That demonstrates that the combined method is superior to the three individual methods.

Table 4
The weight coefficients of the combined method.

Day         BP         ANFIS      diff-SARIMA
Monday      0.393123   0.33988    0.343686
Tuesday     0.378311   0.239307   0.384519
Wednesday   0.236525   0.347067   0.394573
Thursday    0.073223   0.0346     0.871758
Friday      0.326001   0.500754   0.13219
Saturday    0.009652   0.835712   0.150481
Sunday      0.456883   0.423969   0.073735

5.6. Comparisons between the combined method and other individual methods

In this section, to verify the forecasting ability of the combined method, it is compared with the three individual methods (BP, ANFIS and diff-SARIMA). Fig. 8 shows the predicted values of all four methods for each day, with each color standing for a different method. As seen from Fig. 8, the curve of the combined method was more consistent with the actual data, which means the combined method outperformed the individual methods. Among the three individual methods, it can be roughly seen that the BP curve was less consistent with the actual data on Monday, Thursday and Friday, while the ANFIS and diff-SARIMA curves were more consistent with the actual data on Tuesday, Wednesday and Saturday. However, the diff-SARIMA curve was less consistent with the actual data on Monday, Thursday and Friday, where the difference between the actual and forecasted data was larger.

Table 5 shows the calculated values of the three statistical metrics and the average values for the four methods. More detailed information is given in Table 6, which consists of the values of the four
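Table 5
Three statistical metrics of the four methods.
Day | RMSE (MW): BP, diff-SARIMA, ANFIS, Combined | MAE (MW): BP, diff-SARIMA, ANFIS, Combined | MAPE (%): BP, diff-SARIMA, ANFIS, Combined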
Monday 1012.603 540.778 411.768 431.770 947.827 441.194 315.377 334.757 9.601 4.741 3.122 3.275
Tuesday 334.641 169.799 217.696 99.038 294.073 147.019 195.045 82.372 3.020 1.545 2.061 0.827
Wednesday 340.268 422.462 352.229 157.782 307.035 411.315 299.003 136.309 2.947 4.051 2.942 1.400
Thursday 856.592 384.332 296.770 162.265 802.427 346.598 230.136 122.167 8.898 3.418 2.358 1.242
Friday 1095.342 414.669 264.709 263.800 1055.578 361.705 196.388 219.549 9.711 3.543 1.999 2.226
Saturday 164.221 506.207 372.587 170.185 139.650 486.873 340.176 139.984 1.597 5.208 3.790 1.639
Sunday 542.335 549.440 363.579 110.335 504.072 530.324 335.067 81.593 5.601 5.922 3.857 0.969
Whole week 565.515 426.812 325.620 199.311 528.356 389.290 273.027 159.533 5.315 4.061 2.876 1.654
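The whole-week row of Table 5 gives the relative error reduction of the combined method over each individual method. The sketch below recomputes those reductions. It assumes that, within each metric, the four values are ordered BP, diff-SARIMA, ANFIS, combined; this ordering is inferred from the daily MAPE figures quoted in Sections 5.3 and 5.4, and the dictionary layout and the name `reduction_pct` are illustrative only.

```python
# Whole-week values from Table 5; the column order is an assumption (see the text above).
whole_week = {
    "RMSE": {"BP": 565.515, "diff-SARIMA": 426.812, "ANFIS": 325.620, "Combined": 199.311},
    "MAE": {"BP": 528.356, "diff-SARIMA": 389.290, "ANFIS": 273.027, "Combined": 159.533},
    "MAPE": {"BP": 5.315, "diff-SARIMA": 4.061, "ANFIS": 2.876, "Combined": 1.654},
}


def reduction_pct(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Reduction achieved by the combined method relative to each individual method.
reductions = {
    metric: {
        method: round(reduction_pct(value, values["Combined"]), 1)
        for method, value in values.items()
        if method != "Combined"
    }
    for metric, values in whole_week.items()
}

for metric, per_method in reductions.items():
    print(metric, per_method)
```

Under this column assignment, the reductions relative to BP round to the 64.8% (RMSE), 69.8% (MAE) and 68.9% (MAPE) quoted in Section 5.6.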
Table 6
Actual and forecasted values (MW) of the four methods on Wednesday.
Time (h) | Actual | BP | diff-SARIMA | ANFIS | Combined | Time (h) | Actual | BP | diff-SARIMA | ANFIS | Combined
0:00 9075.54 9075.229 9396.926 9388.197 9078.028 12:00 9835.35 10237.26 10259.48 10071 9983.172
0:30 8852.95 8860.05 9184.872 9124.196 8857.232 12:30 9810.23 10211.02 10248.29 9983.349 9948.918
1:00 8711.13 8657.702 8954.259 8961.677 8657.571 13:00 9793.52 10151.83 10234.31 10011.44 9929.5
1:30 8508.96 8356.377 8676.238 8577.358 8352.39 13:30 9756.38 10137.14 10229.14 9969.884 9912.536
2:00 8144.21 7971.671 8306.902 8100.33 7960.313 14:00 9743.84 10139.62 10239.42 9925.841 9907.031
2:30 7773.2 7600.422 7962.122 7779.599 7619.563 14:30 9766.08 10099.98 10176.89 9945.684 9873.298
3:00 7498.6 7336.428 7738.948 7520.544 7378.608 15:00 9808.06 10128.32 10224.44 9964.103 9906.252
3:30 7306.29 7202.014 7575.665 7421.331 7244.063 15:30 9960.08 10297.97 10389.67 10073.45 10056.19
4:00 7228.83 7145.443 7542.275 7418.594 7210.608 16:00 10236.11 10582.18 10616 10328.09 10304.36
4:30 7288.78 7274.426 7630.468 7475.585 7303.651 16:30 10685.9 11068.91 11017.19 10802.27 10743.75
5:00 7555.85 7545.053 7956.35 7721.507 7584.328 17:00 11297.21 11791.95 11724.87 11311.4 11394.34
5:30 8025.08 8081.051 8393.846 8307.872 8081.669 17:30 11713.4 12218.42 12237.19 11335.49 11750.2
6:00 8844.56 9024.524 9264.007 9481.52 9030.057 18:00 11705.29 12168.57 12155.95 11333.26 11700.32
6:30 9643.28 9912.546 10084.56 10190.35 9829.684 18:30 11569.77 12011.77 12008.51 11336.38 11588.46
7:00 10058.81 10427.97 10515.63 10637.54 10284.43 19:00 11285.7 11723.37 11736.86 11338.23 11381.62
7:30 10470.08 10863.09 10900.37 11049.03 10684.58 19:30 11051.7 11509.57 11559.16 11324.44 11234.04
8:00 10554.39 10928.77 10953.98 11070.16 10733.53 20:00 10853.03 11285.09 11373.49 11297.21 11076.43
8:30 10435.16 10781.53 10838.19 10877.77 10591.23 20:30 10585.16 10996.06 11141.26 11207.94 10863.37
9:00 10406.96 10792.66 10831.42 10781.92 10569.75 21:00 10252.17 10700.97 10819.81 10950.77 10573.29
9:30 10344.68 10771.34 10787.81 10655.88 10515.33 21:30 9887.65 10329.16 10450.92 10518.93 10196.55
10:00 10279.92 10690.77 10715.47 10554.61 10434.87 22:00 9862.22 10269.69 10378.09 10396.27 10118.16
10:30 10151.66 10590.24 10627.21 10405.53 10329.9 22:30 9544.33 9867.452 10066.39 10045.9 9772.699
11:00 10009.7 10397.01 10486.91 10285.42 10179.07 23:00 9370.54 9626.381 9834.738 9837.936 9548.438
11:30 9912.55 10269.94 10325.88 10207.75 10053.06 23:30 9214.94 9445.299 9640.604 9664.001 9367.85
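To make the link between Table 6 and the error measures concrete, the sketch below evaluates RMSE, MAE and MAPE on four of the 48 Wednesday rows (0:00, 6:00, 12:00 and 18:00). It assumes the standard definitions of the three measures. The forecast columns are labelled by position only, because the method sub-headers are not shown with the table. The function and variable names are illustrative, and a four-row sample is not expected to reproduce the full-day figures.

```python
import math


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(f - a) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Rows 0:00, 6:00, 12:00 and 18:00 of Table 6, all values in MW.
actual = [9075.54, 8844.56, 9835.35, 11705.29]

# ASSUMPTION: columns are identified by their position in the table,
# not by method name.
forecasts = {
    "column 1": [9075.229, 9024.524, 10237.26, 12168.57],
    "column 2": [9396.926, 9264.007, 10259.48, 12155.95],
    "column 3": [9388.197, 9481.52, 10071.0, 11333.26],
    "column 4": [9078.028, 9030.057, 9983.172, 11700.32],
}

metrics = {
    name: {
        "RMSE": rmse(actual, values),
        "MAE": mae(actual, values),
        "MAPE": mape(actual, values),
    }
    for name, values in forecasts.items()
}

for name, row in metrics.items():
    print(name, {key: round(value, 3) for key, value in row.items()})
```

The same three functions can be applied to all 48 rows of a column to obtain daily values comparable to those in Table 5.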
methods on Wednesday. Table 5 shows that each method achieves the best accuracy at some specific time points. Comparing the three individual methods with the combined method gave the following results:
Combined method vs. BP: the RMSE, MAE and MAPE of the combined method were lower than those of BP on every day except Saturday, when the MAPE of the combined method was slightly higher than that of BP. Taking the average values into account, the combined method performed better than BP, reducing RMSE by 64.8%, MAE by 69.8% and MAPE by 68.9%.
Combined method vs. diff-SARIMA: the RMSE, MAE and MAPE of the combined method were lower than those of diff-SARIMA on every day, so diff-SARIMA was not as good as the combined method. The combined method reduced RMSE by 53.3%, MAE by 59.0% and MAPE by 59.3%.
Combined method vs. ANFIS: the RMSE, MAE and MAPE of the combined method were lower than those of ANFIS except on Monday, when the MAE and MAPE of ANFIS were lower than those of the combined method. In terms of the average values, ANFIS was not as good as the combined method, which reduced RMSE by 38.8%, MAE by 41.7% and MAPE by 42.5%.
All in all, BP performed better than the combined method on Saturday, and ANFIS performed better than the combined method on Monday and Friday. Taking the average values of the whole week into account, however, the combined method had lower error values than the three individual methods; that is, it outperformed them overall.
To contrast the four methods in detail, the 48 observations on Wednesday were taken as an illustration. Fig. 9 shows that before 9:00 BP was as effective as the combined method, with high accuracy, whereas after 9:00 the differences between the actual data and the forecasting data became greater. Although ANFIS performed better between 15:00 and 18:00, it could not match the combined method over the whole day. On Wednesday, the forecasting results of diff-SARIMA were always lower than the actual data, which indicates that diff-SARIMA did not predict well on that day. For the combined method, the forecast curve and the actual curve on Wednesday almost coincided. This demonstrates that the combined method was superior to the three individual ones.
Furthermore, to display the accuracy of each method clearly, Fig. 10 shows the errors, i.e. the differences between the forecasting values and the actual values, of all four methods for every day. As shown in Fig. 10, the combined method had the smallest errors, close to zero for the whole week. Meanwhile, the three individual methods had small errors only on some days and larger errors on most days. Taking Wednesday as
Y. Yang et al. / Applied Soft Computing 49 (2016) 663–675 673
an example: the errors of the combined method were almost zero and its error curve was as smooth as a straight line, which means that the forecasting values of the combined method were close to the actual values. The errors of diff-SARIMA also formed almost a line, but all of them were above zero. The curve of ANFIS fluctuated up and down noticeably. All in all, the combined method had the smallest errors among all the methods, which demonstrates that it was superior to the three individual methods.
Because forecasting errors increase the burden on electrical production costs, increasing the accuracy of electricity demand forecasting will considerably decrease electricity consumption. The combined method takes advantage of each single method, which makes it superior to the separate methods. As a result, the combined method possesses high accuracy and stability. Meanwhile, the DE optimization algorithm is used to optimize the proposed method. The MAPE of the combined method was 1.654%, much lower than those of BP, ANFIS and diff-SARIMA. In a nutshell, the aforementioned comparison results confirm that the combined method outperformed the three individual methods.
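The idea of weighting the individual forecasts and tuning the weights with DE can be illustrated with a small self-contained sketch. It is not the authors' implementation. It assumes a basic DE/rand/1/bin scheme written from scratch, MAPE as the objective, and weights bounded to [0, 1] with no sum-to-one constraint. It treats the first three forecast columns of four Table 6 rows (0:00, 6:00, 12:00, 18:00) as the three individual forecasts, which is an assumption about the column order. The population size, scale factor, crossover rate and all function names are illustrative choices.

```python
import random


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(f - a) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def combine(weights, forecasts):
    """Weighted sum of the individual forecasts at each time point."""
    return [sum(w * f for w, f in zip(weights, point)) for point in zip(*forecasts)]


def differential_evolution(objective, seeds, dim, pop_size=20, f_scale=0.5,
                           cross_rate=0.9, generations=200, seed=0):
    """Minimal DE/rand/1/bin over the box [0, 1]^dim with greedy selection."""
    rng = random.Random(seed)
    pop = [list(s) for s in seeds]
    while len(pop) < pop_size:
        pop.append([rng.random() for _ in range(dim)])
    cost = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = list(pop[i])
            for j in range(dim):
                if j == forced or rng.random() < cross_rate:
                    mutant = pop[a][j] + f_scale * (pop[b][j] - pop[c][j])
                    trial[j] = min(1.0, max(0.0, mutant))  # clip to the bounds
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:  # keep the trial only if it is no worse
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=lambda k: cost[k])
    return pop[best], cost[best]


# Four rows of Table 6 (MW). ASSUMPTION: the first three forecast columns
# are the three individual methods.
actual = [9075.54, 8844.56, 9835.35, 11705.29]
forecasts = [
    [9075.229, 9024.524, 10237.26, 12168.57],
    [9396.926, 9264.007, 10259.48, 12155.95],
    [9388.197, 9481.52, 10071.0, 11333.26],
]

individual_mapes = [mape(actual, series) for series in forecasts]

# Seeding the population with the unit vectors means each single method is a
# candidate, so the result can never be worse than the best single method.
unit_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
best_weights, best_mape = differential_evolution(
    lambda w: mape(actual, combine(w, forecasts)), unit_vectors, dim=3
)

print("individual MAPE (%):", [round(m, 3) for m in individual_mapes])
print("weights:", [round(w, 4) for w in best_weights], "combined MAPE (%):", round(best_mape, 3))
```

With only four data points and three free weights the fit is optimistic. In practice the weights would be tuned on one data set and evaluated on another.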
[40] A. Azadeh, S.F. Ghaderi, S. Sohrabkhani, Annual electricity consumption [49] A.A.A. El Ela, M.A. Abido, S.R. Spea, Optimal power flow using differential
forecasting by neural network in high energy consuming industrial sectors, evolution algorithm, Electr. Eng. 91 (2) (2009) 69–78.
Energy Convers. Manage. 49 (8) (2008) 2272–2278. [50] B.V. Babu, S.A. Munawar, Differential evolution strategies for optimal design
[41] S. Haykin, N. Network, A comprehensive foundation, Neural Netw. 2004 of shell-and-tube heat exchangers, Chem. Eng. Sci. 62 (14) (2007) 3720–3739.
(2004) 2. [51] M.H. Khademi, M.R. Rahimpour, A. Jahanmiri, Differential evolution (DE)
[42] D.Ö. Faruk, A hybrid neural network and ARIMA model for water quality time strategy for optimization of hydrogen production, cyclohexane
series prediction, Eng. Appl. Artif. Intell. 23 (4) (2010) 586–594. dehydrogenation and methanol synthesis in a hydrogen-permselective
[43] J.S.R. Jang, Self-learning fuzzy controllers based on temporal membrane thermally coupled reactor, Int. J. Hydrogen Energy 35 (5) (2010)
backpropagation[, IEEE Trans. Neural Netw. 3 (5) (1992) 714–723. 1936–1950.
[44] M. Liu, Y.Y. Ling, Using fuzzy neural network approach to estimate [52] J.S. Armstrong, Combining forecasts: the end of the beginning or the
contractors’ markup, Build. Environ. 38 (11) (2003) 1303–1308. beginning of the end? Int. J. Forecast. 5 (4) (1989) 585–588.
[45] G.E.P. Box, G.M. Jenkins, G.C.et al. Reinsel, Time Series Analysis: Forecasting [53] C. Christodoulos, C. Michalakelis, D. Varoutas, Forecasting with limited data:
and Control, John Wiley & Sons, 2015. combining ARIMA and diffusion models, Technol. Forecast. Soc. Change 77 (4)
[46] F.M. Tseng, H.C. Yu, G.H. Tzeng, Combining neural network model with (2010) 558–565.
seasonal time series ARIMA model, Technol. Forecast. Soc. Change 69 (1) [54] S. Gupta, P.C. Wilton, Combination of forecasts: an extension, Manage. Sci. 33
(2002) 71–87. (3) (1987) 356–372.
[47] R. Storn, K. Price, Differential Evolution-a Simple and Efficient Adaptive [55] Y. Chen, Y. Yang, C. Liu, et al., A hybrid application algorithm based on the
Scheme for Global Optimization over Continuous Spaces, ICSI, Berkeley, 1995. support vector machine and artificial intelligence: an example of electric load
[48] D.Y. Sha, C.Y. Hsu, A new particle swarm optimization for the open shop forecasting, Appl. Math. Model. 39 (9) (2015) 2617–2632.
scheduling problem, Comput. Oper. Res. 35 (10) (2008) 3243–3261.