Jordy Ariel Loor Vélez
Abstract
Smart home energy management systems help the distribution grid operate more efficiently and reliably,
and enable effective penetration of distributed renewable energy sources. These systems rely on robust
forecasting, optimization, and control/scheduling algorithms that can handle the uncertain nature of demand
and renewable generation. This paper proposes an advanced ML algorithm, called Recurrent Trend Predictive
Neural Network based Forecast Embedded Scheduling (rTPNN-FES), to provide efficient residential demand
control. rTPNN-FES is a novel neural network architecture that simultaneously forecasts renewable energy
generation and schedules household appliances. Through its embedded structure, rTPNN-FES eliminates
the need for separate forecasting and scheduling algorithms and generates a schedule that is robust against
forecasting errors. This paper also evaluates the performance of the proposed algorithm for an IoT-enabled
smart home. The evaluation results reveal that rTPNN-FES provides near-optimal scheduling 37.5 times
faster than the optimization while outperforming state-of-the-art forecasting techniques.
Keywords: energy management, forecasting, scheduling, neural networks, recurrent trend predictive neural
network
build energy management systems. Moreover, in a recent work [47], Nakıp et al. mimicked the scheduling via an ANN and developed an energy management system using this ANN-based scheduling. However, in contrast with the rTPNN-FES proposed in this paper, none of them has used an ANN to generate the schedule or combined forecasting and scheduling in a single neural network architecture.

3. System Setup and Optimization Problem

In this section, we present the assumptions, mathematical definitions and the optimization problem related to the system setup, which is used for embedded forecasting-scheduling via rTPNN-FES and is shown in Figure 1. Throughout this paper, rTPNN-FES is assumed to operate at the beginning of a scheduling window that consists of S equal-length slots and has a total duration of H in actual time (i.e. the horizon length). In addition, the length of each slot s equals H/S, and the actual time instance at which slot s starts is denoted by m_s. Then, we let g^{m_s} denote the power generation by the renewable energy source within slot s. Also, ĝ^{m_s} denotes the forecast of g^{m_s}.

We let 𝒩 be the set of devices that need to be scheduled until H (in other words, until the end of slot S), and N denote the total number of devices, i.e. |𝒩| = N. Each device n ∈ 𝒩 has a constant power consumption per slot denoted by E_n. In addition, n should be active uninterruptedly for a_n successive slots; that is, when n is started, it consumes E_n until it stops. Moreover, we assume that the considered renewable energy system contains a battery with a capacity of B_max, where the stored energy in this battery is used via an inverter with a supply limit of Θ. We assume that there is enough energy in total (the sum of the stored energy in the battery and the total generation) to supply all devices within [0, H].

At the beginning of the scheduling window, we forecast the renewable energy generation and schedule the devices accordingly. To this end, as the main contribution of this paper, we combine the forecaster and the scheduler in a single neural network architecture, called rTPNN-FES, which is presented in Section 4.

Optimization Problem: We now define the optimization problem for the non-preemptive scheduling of the starting slots of devices to minimize user dissatisfaction. In other words, this optimization problem aims to distribute the energy consumption over slots prioritizing “user satisfaction”, assuming that the operation of each device is uninterruptible. In this article, we consider a completely off-grid system – which utilizes only renewable energy sources – where it is crucial to achieve near-optimal scheduling to use the limited available resources. Recall that this optimization problem is re-solved at the beginning of each scheduling window for the available set of devices 𝒩 using the forecast generation ĝ^{m_s} over the scheduling window in Figure 1.

Moreover, for each n ∈ 𝒩, there is a predefined cost of user dissatisfaction, denoted by c_{(n,s)}, for scheduling the start of n at slot s. This cost can take a value in the range [0, +∞), and c_{(n,s)} is set to +∞ if the user does not want slot s to be reserved for device n. As we shall explain in more detail in Section 5, we determine the user dissatisfaction cost c_{(n,s)} as an increasing function of the distance between s and the desired start time of the considered device n. We should note that the definition of the user dissatisfaction cost only affects the numerical results, since the proposed rTPNN-FES methodology does not depend on its definition.

Then, we let x_{(n,s)} denote a binary schedule for the start of the activity of device n at slot s. That is, x_{(n,s)} = 1 if device n is scheduled to start at the beginning of slot s, and x_{(n,s)} = 0 otherwise. In addition, in our optimization program, we let x*_{(n,s)} be a binary decision variable that denotes the optimal value of x_{(n,s)}. Accordingly, we define the optimization problem as follows:

    min Σ_{n∈𝒩} Σ_{s=1}^{S} x*_{(n,s)} c_{(n,s)}    (1)

subject to

    Σ_{s=1}^{S−(a_n−1)} x*_{(n,s)} = 1,   ∀n ∈ 𝒩    (2)

    Σ_{n∈𝒩} E_n Σ_{s′=[s−(a_n−1)]^+}^{s} x*_{(n,s′)} ≤ Θ,   ∀s ∈ {1, . . . , S}    (3)

    Σ_{n∈𝒩} E_n Σ_{s′=[s−(a_n−1)]^+}^{s} x*_{(n,s′)} ≤ ĝ^{m_s} + B_max,   ∀s ∈ {1, . . . , S}    (4)

    Σ_{n∈𝒩} Σ_{s′=1}^{s} E_n Σ_{s′′=[s′−(a_n−1)]^+}^{s′} x*_{(n,s′′)} ≤ B + Σ_{s′=1}^{s} ĝ^{m_{s′}},   ∀s ∈ {1, . . . , S}    (5)

where [Ξ]^+ = Ξ if Ξ ≥ 1; otherwise, [Ξ]^+ = 1.

The objective function (1) minimizes the total user dissatisfaction cost over all devices, Σ_{n∈𝒩} Σ_{s=1}^{S} x*_{(n,s)} c_{(n,s)}. While minimizing user dissatisfaction, the optimization problem also considers the following constraints:

• The Uniqueness and Operation constraint in (2) ensures that each device n is scheduled to start at exactly one slot between the 1-st and the [S − (a_n − 1)]-th slot. The upper limit for the start of the operation of device n is set to [S − (a_n − 1)] because n must operate for a_n successive slots before the end of the last slot S.

• The Inverter Limitation constraint in (3) limits the total power consumption at each slot s to the maximum power Θ that can be provided by the inverter. Note that the term Σ_{s′=[s−(a_n−1)]^+}^{s} x*_{(n,s′)} is a convolution which equals 1 if device n is scheduled to be active at slot s (i.e. n is scheduled to start between s − (a_n − 1) and s).

• The Maximum Storage constraint in (4) ensures that the scheduled consumption at each slot s does not exceed the sum of the predicted generation (ĝ^{m_s}) at this slot and the maximum energy (B_max) that can be stored in the battery.

• The Total Consumption constraint in (5) ensures that the scheduled total power consumption until each slot s is not greater than the summation of the stored energy, B, at the beginning of the scheduling window and the total generation until s. This constraint is used as we are considering a completely off-grid system.

4. Recurrent Trend Predictive Neural Network based Forecast Embedded Scheduling (rTPNN-FES)

In this section, we present our rTPNN-FES neural network architecture. Figure 2 displays the architectural design of rTPNN-FES, which aims to generate the schedule for the considered window while forecasting the power generation through this window automatically and simultaneously. To this end, rTPNN-FES is comprised of two main layers, the “Forecasting Layer” and the “Scheduling Layer”, and it is trained using the “2-Stage Training Procedure”.

Figure 2: Recurrent Trend Predictive Neural Network based Forecast Embedded Scheduling (rTPNN-FES)

We let ℱ be the set of features with ℱ ≡ {1, . . . , F}. In addition, z_f^{m_s} denotes the value of input feature f in slot s, which starts at m_s, where this feature can be any external data, such as weather predictions, that is directly or indirectly related to the power generation g^{m_s}. We also let τ_f be a duration of time over which the system developer has observed that feature f has periodicity; τ_0 represents the periodicity duration for g^{m_s}. Note that we do not assume that the features have a periodic nature; if there is no observed periodicity, τ_f can be set to H.

As shown in Figure 2, the inputs of rTPNN-FES are {g^{m_s−2τ_0}, g^{m_s−τ_0}} and {z_f^{m_s−2τ_f}, z_f^{m_s−τ_f}} for f ∈ ℱ, and its output is {x_{(n,s)}} for n ∈ {1, . . . , N} and s ∈ {1, . . . , S}.
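Returning to the optimization problem of Section 3, a candidate schedule can be checked directly against objective (1) and constraints (2)-(5). The following sketch is not the paper's implementation; the helper names (`evaluate`, `active`) and the data layout are assumptions, with 1-based slot indices as in the text.

```python
# Sketch (not the paper's code): checking a candidate schedule against
# objective (1) and constraints (2)-(5). Names such as `evaluate` and
# `active` are hypothetical; slots are 1-based as in the text above.

def active(start, s, a_n):
    """1 if a device started at slot `start` (running a_n slots) is on in slot s."""
    return 1 if start <= s <= start + a_n - 1 else 0

def evaluate(x, devices, g_hat, B, B_max, theta, S):
    """x[n] = chosen start slot of device n.
    devices[n] = (E_n, a_n, c_n), where c_n[s-1] is the dissatisfaction cost.
    Returns (total dissatisfaction cost, feasibility flag)."""
    cost = sum(devices[n][2][x[n] - 1] for n in x)                    # objective (1)
    feasible = all(1 <= x[n] <= S - (devices[n][1] - 1) for n in x)   # constraint (2)
    for s in range(1, S + 1):
        load = sum(E * active(x[n], s, a) for n, (E, a, _) in devices.items())
        feasible &= load <= theta                                     # inverter limit (3)
        feasible &= load <= g_hat[s - 1] + B_max                      # storage limit (4)
        used = sum(E * active(x[n], t, a)
                   for n, (E, a, _) in devices.items()
                   for t in range(1, s + 1))
        feasible &= used <= B + sum(g_hat[:s])                        # off-grid budget (5)
    return cost, feasible
```

Such a checker is convenient for validating both the optimal schedule x* and the schedule emitted by rTPNN-FES against the same constraint set.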
4.1. Forecasting Layer

The Forecasting Layer is responsible for forecasting the power generation within the architecture of rTPNN-FES. For each slot s in the scheduling window, rTPNN-FES forecasts the renewable energy generation ĝ^{m_s} based on the collection of the past feature values for two periods, {z_f^{m_s−2τ_f}, z_f^{m_s−τ_f}}_{f∈ℱ}, as well as the past generation for two periods, {g^{m_s−2τ_0}, g^{m_s−τ_0}}. To this end, this layer consists of S parallel rTPNN models that share the same parameter set (connection weights and biases). That is, in this layer, there are S replicas of a single trained rTPNN; in other words, one may say that a single rTPNN is used with different inputs to forecast the power generation for each slot s. Therefore, all but one of the Trained rTPNN blocks are shown as transparent in Figure 2.

The weight sharing among rTPNN models (i.e. using replicated rTPNNs) has the following advantages:

• The number of parameters in the Forecasting Layer decreases by a factor of S, reducing both time and space complexity.

• By avoiding repeating the rTPNN training S times, the training time is significantly reduced.

• Because a single rTPNN is trained on the data collected over S different slots, the rTPNN can capture recurrent trends and relationships with higher generalization ability.

4.1.1. Structure of rTPNN

We now briefly explain the structure of rTPNN, which was originally proposed in [1], for our rTPNN-FES neural network architecture. As shown in Figure 3, which displays the structure of rTPNN, for any s, the inputs of rTPNN are {g^{m_s−2τ_0}, g^{m_s−τ_0}} and {z_f^{m_s−2τ_f}, z_f^{m_s−τ_f}} for f ∈ ℱ, and the output is ĝ^{m_s}. In addition, the rTPNN architecture consists of (F + 1) Data Processing (DP) units and L fully connected layers, including the output layer.

4.1.2. DP units

In the architecture of rTPNN, there is one DP unit either for the past values of energy generation, denoted by DP_0, or for each time series feature f, denoted by DP_f. That is, DP_f for any feature f (including f = 0) has the same structure, but its corresponding input is different for each f. For example, the input of DP_f is {z_f^{m_s−2τ_f}, z_f^{m_s−τ_f}} for any time series feature f ∈ {1, . . . , F}, while the input of DP_0 is the past values of energy generation {g^{m_s−2τ_0}, g^{m_s−τ_0}}. Thus, one may notice that DP_0 is the only unit with a special input.

During the explanation of the DP unit, we focus on a particular instance DP_f, which is also shown in detail in Figure 3. Using the input pair {z_f^{m_s−2τ_f}, z_f^{m_s−τ_f}}, DP_f aims to learn the relationship between this pair and each of the predicted trend t_f^s and the predicted level l_f^s. To this end, DP_f consists of Trend Predictor and Level Predictor sub-units, each of which is a linear recurrent neuron.

As shown in Figure 3, the Trend Predictor of DP_f computes the weighted sum of the change in the value of feature f from m_s − 2τ_f to m_s − τ_f and the previous value of the predicted trend. That is, DP_f calculates the sum of the difference (z_f^{m_s−τ_f} − z_f^{m_s−2τ_f}) with connection weight α_1^f and the previous value of the predicted trend t_f^{s−1} with connection weight α_2^f as

    t_f^s = α_1^f (z_f^{m_s−τ_f} − z_f^{m_s−2τ_f}) + α_2^f t_f^{s−1}    (6)

By calculating the trend of a feature and learning the parameters in (6), rTPNN is able to capture behavioural changes over time, particularly those related to the forecasting of ĝ^{m_s}.

The Level Predictor sub-unit of DP_f predicts the level of the feature value, which is the smoothed version of the value of feature f, using only z_f^{m_s−τ_f} and the previous state of the predicted level l_f^{s−1}. To this end, it computes the sum of z_f^{m_s−τ_f} and l_f^{s−1} with weights β_1^f and β_2^f respectively as

    l_f^s = β_1^f z_f^{m_s−τ_f} + β_2^f l_f^{s−1}    (7)

By predicting the level, we can reduce the effects on the forecasting of any anomalous instantaneous changes in the measurement of any other feature f. Note that the parameters α_1^f, α_2^f, β_1^f and β_2^f of the Trend Predictor and Level Predictor sub-units are learned during the rTPNN training like all other parameters (i.e. connection weights).

Figure 3: The structure of rTPNN (DP units with Trend Predictor and Level Predictor sub-units)

4.1.3. Feed-forward of rTPNN

We now describe the calculations performed during the execution of the rTPNN; that is, when making a prediction via rTPNN. To this end, first, let W_l denote the connection weight matrix for the inputs of hidden layer l, and b_l denote the vector of biases of l. Thus, for each s, the forward pass of rTPNN is as follows:

1. Trend Predictors of DP_0–DP_F:

    t_0^s = α_1^0 (g^{m_s−τ_0} − g^{m_s−2τ_0}) + α_2^0 t_0^{s−1},
    t_f^s = α_1^f (z_f^{m_s−τ_f} − z_f^{m_s−2τ_f}) + α_2^f t_f^{s−1},   ∀f ∈ ℱ    (8)

2. Level Predictors of DP_0–DP_F:

    l_0^s = β_1^0 g^{m_s−τ_0} + β_2^0 l_0^{s−1},
    l_f^s = β_1^f z_f^{m_s−τ_f} + β_2^f l_f^{s−1},   ∀f ∈ ℱ    (9)

3. Concatenation of the outputs of DP_0–DP_F to feed to the hidden layers:

    z^s = [t_0^s, l_0^s, g^{m_s−τ_0}, . . . , t_F^s, l_F^s, z_F^{m_s−τ_F}]    (10)

4. Hidden Layers from l = 1 to l = L:

    O_1^s = Ψ(W_1 (z^s)^T + b_1),    (11)
    O_l^s = Ψ(W_l O_{l−1}^s + b_l),   ∀l ∈ {2, . . . , L − 1}    (12)
    ĝ^{m_s} = Ψ(W_L O_{L−1}^s + b_L),    (13)

where (z^s)^T is the transpose of the input vector z^s,
O_l^s is the output vector of hidden layer l, and Ψ(·) denotes the activation function as an element-wise operator.

4.2. Scheduling Layer

The Scheduling Layer consists of N parallel softmax layers, each responsible for generating a schedule for a single device's start time. A single softmax layer for device n is shown in Figure 4. Since this layer is cascaded behind the Forecasting Layer, each device n is scheduled to be started at each slot s based on the output of the Forecasting Layer ĝ^{m_s} as well as the system parameters c_{(n,s)}, E_n, B, B_max and Θ for this device n and this slot s.

Figure 4: The structure of Scheduling Layer

In Figure 4, each arrow represents a connection weight. Accordingly, for device n and slot s in a softmax layer of the Scheduling Layer, a neuron first calculates the weighted sum of the inputs as

    α_{(n,s)} = w^g_{(n,s)} ĝ^{m_s} + w^B_{(n,s)} B − w^c_{(n,s)} c_{(n,s)} − w^E_{(n,s)} E_n − w^Θ_{(n,s)} Θ − w^{B_max}_{(n,s)} B_max    (14)

where all of the connection weights w^g_{(n,s)}, w^B_{(n,s)}, w^c_{(n,s)}, w^E_{(n,s)}, w^Θ_{(n,s)} and w^{B_max}_{(n,s)} are strictly positive. In addition, the signs of the terms are determined considering the intuitive effect of each parameter on the schedule decision for device n at slot s. For example, a higher ĝ^{m_s} makes slot s a better candidate to schedule n, while a higher user dissatisfaction cost c_{(n,s)} makes slot s a worse candidate. In addition, a softmax activation is applied at the output of this neuron:

    x_{(n,s)} = Φ(α_{(n,s)}) = e^{α_{(n,s)}} / Σ_{s′=1}^{S} e^{α_{(n,s′)}}    (15)

4.3. 2-Stage Training Procedure

We train our rTPNN-FES architecture to learn the optimal scheduling of devices as well as the forecasting of energy generation in a single neural network. To this end, we first assume that there is a collected dataset comprised of the actual values of g^{m_s} and {z_f^{m_s}}_{f∈ℱ} for s ∈ {1, . . . , S} over multiple scheduling windows. Note that rTPNN-FES does not depend on the developed 2-stage training procedure, so it can be used with any training algorithm. For each window in this dataset, the 2-stage procedure works as follows:

4.3.1. Stage 1 – Training of rTPNN Separately for Forecasting

In this first stage of training, in order to create a forecaster, the rTPNN model (Figure 3) is trained separately from the rTPNN-FES architecture (Figure 2). To this end, the deviation of ĝ^{m_s} from g^{m_s} for s ∈ {1, . . . , S}, i.e. the forecasting error of rTPNN, is measured via the Mean Squared Error as

    MSE_forecast ≡ (1/S) Σ_{s=1}^{S} (g^{m_s} − ĝ^{m_s})^2    (16)

We update the parameters (connection weights and biases) of rTPNN via back-propagation with gradient descent, in particular the Adam algorithm, to minimize MSE_forecast, where the initial parameters are set to the parameters found in the previous training. We repeat the parameter updates for as many epochs as required without over-fitting to the training samples.

When Stage 1 is completed, the parameters of the “Trained rTPNN” in Figure 2 are replaced by the resulting parameters found in this stage. Then, the parameters of the Trained rTPNN are frozen to continue the further training of rTPNN-FES in Stage 2. That is, the parameters of the Trained rTPNN are not updated in Stage 2.

4.3.2. Stage 2 – Training of rTPNN-FES for Scheduling

In Stage 2 of training, in order to create a scheduler emulating the optimization, the rTPNN-FES architecture (Figure 2) is trained following the steps shown in Figure 5.
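To make the per-slot computation of the Forecasting Layer concrete, the forward pass of equations (8)-(13) can be sketched in pure Python for a single external feature (F = 1) and one hidden layer (L = 2). The class name and all weight values below are illustrative assumptions, not the reference implementation of [48].

```python
# Minimal sketch of the rTPNN forward pass, equations (8)-(13), for F = 1
# external feature and L = 2 layers. Names and weights are assumptions.
import math

def psi(v):
    """Element-wise sigmoid as the activation Ψ(·)."""
    return [1.0 / (1.0 + math.exp(-u)) for u in v]

def affine(W, v, b):
    """W v + b for a weight matrix W (list of rows) and bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

class RTPNNSketch:
    def __init__(self, alpha, beta, W1, b1, W2, b2):
        self.alpha, self.beta = alpha, beta          # {f: (α1, α2)}, {f: (β1, β2)}
        self.W1, self.b1, self.W2, self.b2 = W1, b1, W2, b2
        self.t = {f: 0.0 for f in alpha}             # trend states t_f^{s-1}
        self.l = {f: 0.0 for f in alpha}             # level states l_f^{s-1}

    def step(self, past):
        """past[f] = (value at m_s - 2τ_f, value at m_s - τ_f); f = 0 holds
        the past generation. Returns the forecast ĝ^{m_s} for this slot."""
        z = []                                       # concatenation, eq. (10)
        for f, (older, newer) in past.items():
            a1, a2 = self.alpha[f]
            b1, b2 = self.beta[f]
            self.t[f] = a1 * (newer - older) + a2 * self.t[f]   # trend, eq. (8)
            self.l[f] = b1 * newer + b2 * self.l[f]             # level, eq. (9)
            z += [self.t[f], self.l[f], newer]
        h = psi(affine(self.W1, z, self.b1))         # hidden layer, eq. (11)
        return psi(affine(self.W2, h, self.b2))[0]   # output layer, eq. (13)
```

Calling `step` once per slot s, with the same parameter set shared across all S replicas as described in Section 4.1, plays the role of one Trained rTPNN block of Figure 2.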
Figure 5: The steps in Stage 2 training of rTPNN-FES to learn to schedule
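A single softmax layer of the Scheduling Layer, equations (14)-(15), can be sketched as follows. For brevity this sketch shares one positive weight vector across all slots, whereas the layer described above learns a separate weight per device-slot pair; all names and weight values are assumptions.

```python
# Sketch of one Scheduling Layer softmax unit, eqs. (14)-(15). A single
# shared weight vector is an assumed simplification of the per-(n,s) weights.
import math

def schedule_scores(g_hat, c, E_n, B, B_max, theta, w):
    """g_hat[s], c[s]: per-slot generation forecast and dissatisfaction cost.
    w = (wg, wB, wc, wE, wTheta, wBmax), all strictly positive."""
    wg, wB, wc, wE, wT, wBm = w
    # eq. (14): signs reflect each term's intuitive effect on the decision
    alpha = [wg * g + wB * B - wc * cost - wE * E_n - wT * theta - wBm * B_max
             for g, cost in zip(g_hat, c)]
    # eq. (15): softmax over slots (shifted by the max for numerical stability)
    m = max(alpha)
    e = [math.exp(a - m) for a in alpha]
    total = sum(e)
    return [v / total for v in e]
```

The slot with the largest score is the start slot the layer favours: a high forecast generation and a low dissatisfaction cost pull the softmax mass toward that slot.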
Appliance Name | Power Consumption (kW) | Active Duration | Desired Start Time
Washing Machine (warm wash) | 2.3 | 2 | 14
Dryer (avg. load) | 3 | 2 | 16 (earliest 15)
Robot Vacuum Cleaner | 0.007 | 2 | 15
Iron | 1.08 | 2 | 8
TV | 0.15 | 3 | 20
*Refrigerator | 0.083 | 24 | non-stop
Oven | 2.3 | 1 | 18
Dishwasher | 2 | 2 | 21
Electric Water Heater | 0.7 | 1 | 6, 17
Central AC | 3 | 2 | 6, 18
Pool Filter Pump | 1.12 | 8 | 10
Electric Vehicle Charger | 7.7 | 8 | 21 (earliest 18, latest 23)
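For illustration, a dissatisfaction cost c_{(n,s)} that increases with the distance from the desired start time (as described in Section 3) could be built as follows. The quadratic form and the function name are assumptions for demonstration; the exact cost function used in Section 5 is not reproduced here.

```python
# Illustrative (assumed) construction of c_(n,s): an increasing function of
# |s - desired start|, with +inf for slots the user forbids and for slots
# that cannot fit a_n successive active slots before the end of the window.

def dissatisfaction_costs(desired, a_n, S, forbidden=()):
    """Return [c_(n,1), ..., c_(n,S)] for one device n."""
    costs = []
    for s in range(1, S + 1):
        if s in forbidden or s > S - (a_n - 1):
            costs.append(float('inf'))
        else:
            costs.append((s - desired) ** 2)
    return costs
```

For the washing machine of the table above (desired start 14, two active slots) over a 24-slot day, the cost is zero at slot 14 and grows symmetrically around it.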
Figure 6: Forecasting results of the three most competitive models (rTPNN, LSTM and MLP) with respect to the results in Table 2 for the time between the fifth and seventh days in the test set

Next, in Figure 6, we present the actual energy generation between the fifth and the seventh days of the test set as well as those forecast by the best three techniques (rTPNN, LSTM and MLP). Our results show that the predictions of rTPNN are the closest to the actual generation among the predictions of these three techniques. In addition, we see that rTPNN can successfully capture both increases and decreases in energy generation, while LSTM and MLP struggle to predict sharp increases and decreases.

Figure 7: Histogram of the forecasting error in kW measured as (ĝ^{m_s} − g^{m_s}) for each m_s in the test set

Finally, Figure 7 displays the histogram of the forecasting error that is realized by each of rTPNN, LSTM, and MLP on the test set. Our results in this figure show that the forecasting error of rTPNN is around zero for a significantly large number of samples (around 5000 out of 8664 samples). We also see that the absolute error is smaller than 2 kW for 93% of the samples. We also see that the overall forecasting error is lower for rTPNN than for both LSTM and MLP.

5.3. Scheduling Performance of rTPNN-FES

We now evaluate the scheduling performance of rTPNN-FES for the considered smart home energy management system. To this end, we compare the schedule generated by rTPNN-FES with that generated by the optimization (solving (1)-(5)) using actual energy generations, as well as with the GA-based scheduling (presented in Section 5.1.4). Note that although the schedule generated by the optimization using actual generations is the best achievable schedule, it is practically not available due to the lack of future information about the actual generations.
Figure 8: Comparison of rTPNN-FES against the optimal scheduling and GA-based scheduling with respect to the scheduling cost
(top) for the days of the test set and (bottom) as the boxplot of the cost difference.
Figure 8 (top) displays the comparison of rTPNN-FES against the optimal scheduling and the GA-based scheduling regarding the cost value for the days of the test set. In this figure, we see that rTPNN-FES significantly outperforms GA-based scheduling, achieving close-to-optimal cost. In other words, the user dissatisfaction cost – which is defined in (1) – of rTPNN-FES is significantly lower than the cost of GA-based scheduling, and it is slightly higher than that of optimal scheduling. The average cost difference between rTPNN-FES and optimal scheduling is 1.3% and the maximum difference is about 3.48%.

Furthermore, Figure 8 (bottom) displays the summary of the statistics for the cost difference between rTPNN-FES and the optimal scheduling as well as the difference between GA-based and optimal scheduling as a boxplot. In Figure 8 (bottom), we first see that the cost difference is significantly lower for rTPNN-FES, where even the upper quartile of rTPNN-FES is smaller than the lower quartile of GA-based scheduling. We also
see that the median of the cost difference between rTPNN-FES and optimal scheduling is 0.13 and the upper quartile of that is about 0.146. That is, the cost difference is less than 0.146 for 75% of the days in the test set. In addition, we see that there are only 7 outlier days for which the cost is between 0.19 and 0.3. According to the results presented in Figure 8, rTPNN-FES can be considered a successful heuristic with a low increase in cost.

5.4. Evaluation of the Computation Time

In Table 3, we present measurements of the training and execution times of each forecasting model. Our results first show that the execution time of rTPNN (0.17 ms) is comparable with the execution time of LSTM and highly acceptable for real-time applications. On the other hand, the training time measurements show that the training of rTPNN takes longer than that of the other forecasting models. Accordingly, one may say that there is a trade-off between the training time and the forecasting performance of rTPNN.

rTPNN) in seconds. Note that we do not present the computation time of GA-based scheduling in this figure since it takes 4.61 seconds on average – which is approximately 3 orders of magnitude higher than the computation time of rTPNN-FES and 1 order of magnitude higher than that of the optimization – to find a schedule for a single window. Our results in this figure show that rTPNN-FES requires a significantly lower computation time than the optimization to generate a daily schedule of household appliances. The average computation time of rTPNN-FES is about 4 ms while that of the optimization with LSTM is 150 ms. That is, rTPNN-FES is 37.5 times faster than the optimization with LSTM at simultaneously forecasting and scheduling. Although the absolute computation time difference seems insignificant for a small use case (as in this paper), it would have important effects on the operation of large renewable energy networks with a high number of sources and devices.
renewable energy integration and demand response strat-
egy, Energy 210 (2020) 118602.
[37] M. Ren, X. Liu, Z. Yang, J. Zhang, Y. Guo, Y. Jia, A
novel forecasting based scheduling method for household
energy management system based on deep reinforce-
ment learning, Sustainable Cities and Society 76 (2022)
103207.
[38] P. Lissa, C. Deane, M. Schukat, F. Seri, M. Keane,
E. Barrett, Deep reinforcement learning for home energy
management system control, Energy and AI 3 (2021)
100043.
[39] L. Yu, W. Xie, D. Xie, Y. Zou, D. Zhang, Z. Sun,
L. Zhang, Y. Zhang, T. Jiang, Deep reinforcement
learning for smart home energy management, IEEE
Internet of Things Journal 7 (4) (2020) 2751–2762.
doi:10.1109/JIOT.2019.2957289.
[40] Z. Wan, H. Li, H. He, Residential energy management
with deep reinforcement learning, in: 2018 International
Joint Conference on Neural Networks (IJCNN), 2018, pp.
1–7. doi:10.1109/IJCNN.2018.8489210.
[41] A. Mathew, A. Roy, J. Mathew, Intelligent residential
energy management system using deep reinforcement
learning, IEEE Systems Journal 14 (4) (2020) 5362–5372.
doi:10.1109/JSYST.2020.2996547.
[42] Y. Liu, D. Zhang, H. B. Gooi, Optimization strategy based
on deep reinforcement learning for home energy manage-
ment, CSEE Journal of Power and Energy Systems 6 (3)
(2020) 572–582. doi:10.17775/CSEEJPES.2019.02890.
[43] R. Lu, R. Bai, Y. Ding, M. Wei, J. Jiang, M. Sun, F. Xiao,
H.-T. Zhang, A hybrid deep learning-based online energy
management scheme for industrial microgrid, Applied
Energy 304 (2021) 117857.
[44] Y. Ji, J. Wang, J. Xu, X. Fang, H. Zhang, Real-time energy
management of a microgrid using deep reinforcement
learning, Energies 12 (12) (2019) 2291.
[45] S. Totaro, I. Boukas, A. Jonsson, B. Cornélusse, Lifelong
control of off-grid microgrid with model-based reinforce-
ment learning, Energy 232 (2021) 121035.
[46] Y. Gao, Y. Matsunami, S. Miyata, Y. Akashi, Operational
optimization for off-grid renewable building energy sys-
tem using deep reinforcement learning, Applied Energy
325 (2022) 119783.
[47] M. Nakip, A. Asut, C. Kocabıyık, C. Güzeliş, A smart
home demand response system based on artificial neural
networks augmented with constraint satisfaction heuris-
tic, in: 2021 13th International Conference on Electrical
and Electronics Engineering (ELECO), IEEE, 2021, pp.
580–584.
[48] M. Nakıp, Recurrent Trend Predictive Neural
Network - Keras Implementation (5 2022).
doi:10.5281/zenodo.6560245.
URL https://github.com/mertnakip/Recurrent-Trend-Predictive-Neural-Network
[49] S. Katoch, S. S. Chauhan, V. Kumar, A review on genetic
algorithm: past, present, and future, Multimedia Tools
and Applications 80 (5) (2021) 8091–8126.