
Multi-step Time Series Prediction using RNN LSTM

Published by berryedelson on 28 August 2020

Household Power Consumption Prediction using RNN-LSTM
Power outages cause huge losses to the economy, so it is very important to be able to predict power consumption.

Given the rise of smart electricity meters and the wide adoption of electricity generation technology like solar panels, there is a wealth of electricity usage data available.

Problem Statement:
Given the power consumption data for the previous week, predict the power consumption for the next week.

Watch Full Video:

TensorFlow 2.0 Tutorial for Beginners 19 - Multi Step Pr…



Download dataset:
https://archive.ics.uci.edu/ml/machine-learning-databases/00235/household_power_consumption.zip

Details:
https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption

Dataset Description:

The data was collected between December 2006 and November 2010, with observations of power consumption within the household recorded every minute.

It is a multivariate series composed of seven variables:

global_active_power: The total active power consumed by the household (kilowatts).
global_reactive_power: The total reactive power consumed by the household (kilowatts).
voltage: Average voltage (volts).
global_intensity: Average current intensity (amps).
sub_metering_1: Active energy for the kitchen (watt-hours of active energy).
sub_metering_2: Active energy for the laundry (watt-hours of active energy).
sub_metering_3: Active energy for climate control systems (watt-hours of active energy).

This data represents a multivariate time series of power-related variables that could in turn be used to model and even forecast future electricity consumption.

Time-series prediction plays a major role in machine learning, yet it is often neglected. There are many algorithms we could use for these problems, the major ones involving statistical and econometric models such as those in Statsmodels. Today we will look at how to apply deep learning algorithms to predict time-series data.

Why use a Deep Learning Algorithm?

With data volumes growing enormously day by day, we shouldn't confine ourselves to only the standard ML algorithms. Deep learning algorithms help us handle large volumes of data without losing key insights, and tuning the model the right way gives us the maximum yield, i.e., in our case, maximum accuracy. The network also learns from its own architecture whether a prediction is better or worse.

For this time-series forecasting task we will use the Long Short-Term Memory unit (LSTM).

Recurrent Neural Network (RNN)

To understand an LSTM network, we first need to understand a Recurrent Neural Network. This kind of network is used to recognize patterns when past results influence the present result. A typical RNN use case is time-series data, where the order of observations is extremely important. In this architecture, a neuron takes as input not only the regular input (the previous layer's output) but also its own previous state.

It is important to notice that H represents the neuron state. Therefore, at state H_1, the neuron uses as input the parameter X_1 and H_0 (its previous state). The main problem with this model is memory loss: the network's older states are quickly forgotten, so in sequences where we need to remember beyond the immediate past, plain RNNs fail.
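As a minimal illustration of this recurrence (my own sketch, not code from this article), one vanilla RNN step in numpy looks like the following; W_x, W_h, and b are assumed parameter names:

import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # one vanilla RNN step: the new state H_t depends on the
    # current input X_t and the previous state H_(t-1)
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# toy dimensions: 1 input feature, 4 hidden units (illustrative only)
rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 1))
W_h = rng.normal(size=(4, 4))
b = np.zeros(4)

h = np.zeros(4)                                  # H_0, the initial state
for x_t in [np.array([0.1]), np.array([0.5])]:   # X_1, X_2
    h = rnn_step(x_t, h, W_x, W_h, b)            # H_1 uses X_1 and H_0, and so on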

The Long Short-Term Memory unit (LSTM) was created specifically to overcome the limitations of the Recurrent Neural Network (RNN). The typically long datasets of time series can slow down the training of an RNN architecture, and while we could restrict the data volume, that means a loss of information. In any time-series dataset, we need to capture the previous trends and the seasonality of the overall data to make the right predictions.

Before going into a brief explanation of the LSTM cell, let us see what the LSTM cell looks like:

The architecture may look a little complicated at first glance, but it is pretty neat, clear, and easily understandable if we break it into parts.

Let's first understand the inputs and outputs. On the left-hand side of the diagram the inputs are Ct-1, the previous cell state; ht-1, the output of the previous cell; and Xt, the input to the present cell. The outputs of the cell are Ct and ht, the cell state and the output of the present cell.

The first step of an LSTM is the forget gate layer (f), where we determine what we are going to forget from the previous cell state. This takes the inputs ht-1 and Xt, applies a linear transformation with weights and a bias term, and passes the result through the sigmoid function. As the output of a sigmoid is always between 0 and 1, a value near 0 means "forget it" and a value near 1 means "keep it".
Forget gate layer => f = sigmoid(W_f · [ht-1, Xt] + b_f)

The second step is a two-part process; this is where the actual processing of new information happens. In the first part we take the same inputs as before, ht-1 and Xt, apply a linear transformation with weights and biases, and pass the result through a sigmoid function; this is the input gate. In the second part we apply another linear transformation to ht-1 and Xt with its own weights and biases, but this time pass the result through the hyperbolic tangent function (tanh). At the end of this step we have a vector of new candidate values for the present cell.

First part (input gate) => I = sigmoid(W_i · [ht-1, Xt] + b_i)

Second part (candidate values) => II = tanh(W_c · [ht-1, Xt] + b_c)

The third step is the update step, which derives the new cell state Ct from the previous steps. First we multiply the previous cell state by the forget gate layer, then add the element-wise product of the two vectors from the second step; this forms the new cell state Ct of the present cell at time t.

Update layer => Ct = f * Ct-1 + I * II

The 몭nal step is another main output of the cell, for this, we will directly form a linear
transformation with the previous output ht‐1 and input of the present cell Xt with some bias and
weight terms and pass on to a sigmoid layer. Finally, now we will multiply this output to the new cell
state Ct which is passed on to a hyperbolic tangent function. This gives us the present output ht .

Final layer =>

i = sigmoid ( Weights (ht-1,xt) + bias)


몭nal ht = i * tanh(Ct)

Now that we have a clear, step-by-step dissection of the LSTM layer, let's see how to apply the LSTM cell to time-series data.

How to? Let's Begin

Importing Libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from numpy import nan

from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler

#Reading the dataset

data = pd.read_csv('household_power_consumption.txt', sep=';',
                   parse_dates=True, low_memory=False)

#printing top rows

data.head()

         Date      Time  Global_active_power  Global_reactive_power  Voltage  Global_intensity ...
0  16/12/2006  17:24:00                4.216                  0.418  234.840            18.400
1  16/12/2006  17:25:00                5.360                  0.436  233.630            23.000
2  16/12/2006  17:26:00                5.374                  0.498  233.290            23.000
3  16/12/2006  17:27:00                5.388                  0.502  233.740            23.000
4  16/12/2006  17:28:00                3.666                  0.528  235.680            15.800


#concatenating the Date and Time columns into a 'date_time' column

data['date_time'] = data['Date'].str.cat(data['Time'], sep=' ')
data.drop(['Date', 'Time'], inplace=True, axis=1)
data.head()

   Global_active_power  Global_reactive_power  Voltage  Global_intensity  Sub_metering_1 ...
0                4.216                  0.418  234.840            18.400           0.000
1                5.360                  0.436  233.630            23.000           0.000
2                5.374                  0.498  233.290            23.000           0.000
3                5.388                  0.502  233.740            23.000           0.000
4                3.666                  0.528  235.680            15.800           0.000

data.set_index(['date_time'], inplace=True)
data.head()

                     Global_active_power  Global_reactive_power  Voltage  Global_intensity ...
date_time
16/12/2006 17:24:00                4.216                  0.418  234.840            18.400
16/12/2006 17:25:00                5.360                  0.436  233.630            23.000
16/12/2006 17:26:00                5.374                  0.498  233.290            23.000
16/12/2006 17:27:00                5.388                  0.502  233.740            23.000
16/12/2006 17:28:00                3.666                  0.528  235.680            15.800

Next, we can mark all missing values, indicated with a '?' character, with a NaN value, which is a float.

#replacing each '?' character with a NaN value

data.replace('?', nan, inplace=True)

#This allows us to work with the data as one array of floating-point values
#rather than mixed types (which is less efficient)

data = data.astype('float')

#information of the dataset

data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2075259 entries, 16/12/2006 17:24:00 to 26/11/2010 21:02:00
Data columns (total 7 columns):
Global_active_power float64
Global_reactive_power float64
Voltage float64
Global_intensity float64
Sub_metering_1 float64
Sub_metering_2 float64
Sub_metering_3 float64
dtypes: float64(7)
memory usage: 126.7+ MB

#checking the null values

np.isnan(data).sum()

Global_active_power 25979
Global_reactive_power 25979
Voltage 25979
Global_intensity 25979
Sub_metering_1 25979
Sub_metering_2 25979
Sub_metering_3 25979
dtype: int64

We also need to fill in the missing values now that they have been marked.

A very simple approach is to copy the observation from the same time on the previous day. We can implement this in a function named fill_missing() that takes the NumPy array of the data and copies values from exactly 24 hours earlier.

def fill_missing(data):
    # number of minute-level observations in one day
    one_day = 24 * 60
    for row in range(data.shape[0]):
        for col in range(data.shape[1]):
            if np.isnan(data[row, col]):
                data[row, col] = data[row - one_day, col]
fill_missing(data.values)
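The nested Python loop above touches all ~14.5 million cells one by one, which is slow. As a hedged alternative (my own sketch, not the article's code), the same day-before fill can be vectorized with pandas; repeating the shift handles gaps longer than one day:

# vectorized sketch (assumption): fill each NaN with the value from
# exactly one day (1440 rows) earlier, repeating for multi-day gaps
one_day = 24 * 60
for _ in range(10):  # bounded so a gap at the very start cannot loop forever
    if not data.isna().any().any():
        break
    data = data.fillna(data.shift(one_day))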

#checking the nan values

np.isnan(data).sum()

Global_active_power 0
Global_reactive_power 0
Voltage 0
Global_intensity 0
Sub_metering_1 0
Sub_metering_2 0
Sub_metering_3 0
dtype: int64

data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2075259 entries, 16/12/2006 17:24:00 to 26/11/2010 21:02:00
Data columns (total 7 columns):
Global_active_power float64
Global_reactive_power float64
Voltage float64
Global_intensity float64
Sub_metering_1 float64
Sub_metering_2 float64
Sub_metering_3 float64
dtypes: float64(7)
memory usage: 126.7+ MB

#printing the shape of the data

data.shape

(2075259, 7)

Here, we can observe that we have 2,075,259 data points and 7 features.

data.head()

                     Global_active_power  Global_reactive_power  Voltage  Global_intensity ...
date_time
16/12/2006 17:24:00                4.216                  0.418   234.84              18.4
16/12/2006 17:25:00                5.360                  0.436   233.63              23.0
16/12/2006 17:26:00                5.374                  0.498   233.29              23.0
16/12/2006 17:27:00                5.388                  0.502   233.74              23.0
16/12/2006 17:28:00                3.666                  0.528   235.68              15.8

Prepare power consumption for each day

We can now save the cleaned-up version of the dataset to a new file; in this case we will just change the file extension to .csv and save the dataset as 'cleaned_data.csv'.

#conversion of dataframe to .csv

data.to_csv('cleaned_data.csv')

#reading the dataset

dataset = pd.read_csv('cleaned_data.csv', parse_dates=True,
                      index_col='date_time', low_memory=False)

#printing the top rows

dataset.head()
                     Global_active_power  Global_reactive_power  Voltage  Global_intensity ...
date_time
2006-12-16 17:24:00                4.216                  0.418   234.84              18.4
2006-12-16 17:25:00                5.360                  0.436   233.63              23.0
2006-12-16 17:26:00                5.374                  0.498   233.29              23.0
2006-12-16 17:27:00                5.388                  0.502   233.74              23.0
2006-12-16 17:28:00                3.666                  0.528   235.68              15.8

#printing the bottom rows

dataset.tail()

                     Global_active_power  Global_reactive_power  Voltage  Global_intensity ...
date_time
2010-11-26 20:58:00                0.946                    0.0   240.43               4.0
2010-11-26 20:59:00                0.944                    0.0   240.00               4.0
2010-11-26 21:00:00                0.938                    0.0   239.82               3.8
2010-11-26 21:01:00                0.934                    0.0   239.70               3.8
2010-11-26 21:02:00                0.932                    0.0   239.55               3.8

Exploratory Data Analysis

#Downsampling the data into day-wise bins and summing the values of the
#timestamps falling into each bin

data = dataset.resample('D').sum()

#data after resampling it day-wise

data.head()

            Global_active_power  Global_reactive_power    Voltage  Global_intensity ...
date_time
2006-12-16             1209.176                 34.922   93552.53            5180.8
2006-12-17             3390.460                226.006  345725.32           14398.6
2006-12-18             2203.826                161.792  347373.64            9247.2
2006-12-19             1666.194                150.942  348479.01            7094.0
2006-12-20             2225.748                160.998  348923.61            9313.0

Plotting all features over the whole time span

fig, ax = plt.subplots(figsize=(18, 18))

for i in range(len(data.columns)):
    plt.subplot(len(data.columns), 1, i + 1)
    name = data.columns[i]
    plt.plot(data[name])
    plt.title(name, y=0, loc='right')
    plt.yticks([])
plt.show()
fig.tight_layout()
Exploring active power consumption for each year

#we consider the four full years here

years = ['2007', '2008', '2009', '2010']

Year-wise plot of the feature Global_active_power

fig, ax = plt.subplots(figsize=(18, 18))

for i in range(len(years)):
    plt.subplot(len(years), 1, i + 1)
    year = years[i]
    active_power_data = data[str(year)]
    plt.plot(active_power_data['Global_active_power'])
    plt.title(str(year), y=0, loc='left')
plt.show()
fig.tight_layout()

#for year 2006

data['2006']

            Global_active_power  Global_reactive_power    Voltage  Global_intensity ...
date_time
2006-12-16             1209.176                 34.922   93552.53            5180.8
2006-12-17             3390.460                226.006  345725.32           14398.6
2006-12-18             2203.826                161.792  347373.64            9247.2
2006-12-19             1666.194                150.942  348479.01            7094.0
2006-12-20             2225.748                160.998  348923.61            9313.0
2006-12-21             1723.288                144.434  347096.41            7266.4
2006-12-22             2341.338                186.906  347305.75            9897.0
2006-12-23             4773.386                221.470  345795.95           20200.4
2006-12-24             2550.012                149.900  348029.91           11002.2
2006-12-25             2743.120                240.280  350495.90           11450.2
2006-12-26             3934.110                165.102  347940.63           16341.0
2006-12-27             1528.760                178.902  351025.00            6505.2
2006-12-28             2072.638                208.876  350306.40            8764.2
2006-12-29             3174.392                196.394  346854.68           13350.8
2006-12-30             2796.108                312.142  346377.15           11952.6
2006-12-31             3494.196                150.852  345451.07           14687.4

Power consumption distribution with histograms

Year-wise histogram plot of the feature Global_active_power

fig, ax = plt.subplots(figsize=(18, 18))

for i in range(len(years)):
    plt.subplot(len(years), 1, i + 1)
    year = years[i]
    active_power_data = data[str(year)]
    active_power_data['Global_active_power'].hist(bins=200)
    plt.title(str(year), y=0, loc='left')
plt.show()
fig.tight_layout()
Histogram plot for all features

fig, ax = plt.subplots(figsize=(18, 18))

for i in range(len(data.columns)):
    plt.subplot(len(data.columns), 1, i + 1)
    name = data.columns[i]
    data[name].hist(bins=200)
    plt.title(name, y=0, loc='right')
    plt.yticks([])
plt.show()
fig.tight_layout()
Plot the power consumption histogram for each month of 2007

months = [i for i in range(1, 13)]

fig, ax = plt.subplots(figsize=(18, 18))

for i in range(len(months)):
    ax = plt.subplot(len(months), 1, i + 1)
    month = '2007-' + str(months[i])
    active_power_data = dataset[month]
    active_power_data['Global_active_power'].hist(bins=100)
    ax.set_xlim(0, 5)
    plt.title(month, y=0, loc='right')
plt.show()
fig.tight_layout()
Observation:

1. From the above plots, power consumption in the months of Nov, Dec, Jan, Feb, and Mar is higher, as those distributions have a longer tail compared to the other months.

2. This suggests that heating systems are used during the winter season and not in summer.

3. The minute-level distribution is highly concentrated around 0.3 kW and 1.3 kW.

Active Power Usage Prediction

What can we predict?

Forecast hourly consumption for the next day.
Forecast daily consumption for the next week.
Forecast daily consumption for the next month.
Forecast monthly consumption for the next year.

Modeling Methods

There are many modeling methods; a few of them are as follows:

Naive methods -> Naive methods make very simple, but often very effective, assumptions (see the baseline sketch after this list).
Classical linear methods -> Classical linear methods include techniques that are very effective for univariate time-series forecasting.
Machine learning methods -> Machine learning methods require that the problem be framed as a supervised learning problem, for example:
  K-nearest neighbors
  SVM
  Decision trees
  Random forest
  Gradient boosting machines
Deep learning methods -> Combinations of CNN, LSTM, and ConvLSTM have proven effective on time-series classification tasks:
  CNN
  LSTM
  CNN-LSTM
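As a concrete example of a naive method (my own sketch, not part of the original article), we can forecast next week's daily totals as a copy of this week's and score it with RMSE, using the first two December 2006 weeks from the daily table shown earlier:

import numpy as np

# first two weeks of daily Global_active_power totals from the table above
week_1 = np.array([1209.176, 3390.460, 2203.826, 1666.194, 2225.748, 1723.288, 2341.338])
week_2 = np.array([4773.386, 2550.012, 2743.120, 3934.110, 1528.760, 2072.638, 3174.392])

naive_forecast = week_1  # assume next week simply repeats the previous week
rmse = np.sqrt(np.mean((week_2 - naive_forecast) ** 2))
print(f'naive one-week RMSE: {rmse:.1f}')

A baseline like this gives a reference score that any learned model, including the LSTM below, should beat to be worth its complexity.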

Problem Framing:
Given recent power consumption, what is the expected power consumption for the week ahead?

This requires that a predictive model forecast the total active power for each day over the next seven days.

A model of this type could be helpful within the household in planning expenditures. It could also be helpful on the supply side for planning electricity demand for a specific household.

Input -> Predict

[Week1] -> Week2

[Week2] -> Week3

[Week3] -> Week4

#top rows

data.head()

            Global_active_power  Global_reactive_power    Voltage  Global_intensity ...
date_time
2006-12-16             1209.176                 34.922   93552.53            5180.8
2006-12-17             3390.460                226.006  345725.32           14398.6
2006-12-18             2203.826                161.792  347373.64            9247.2
2006-12-19             1666.194                150.942  348479.01            7094.0
2006-12-20             2225.748                160.998  348923.61            9313.0

#printing last rows

data.tail()

            Global_active_power  Global_reactive_power    Voltage  Global_intensity ...
date_time
2010-12-07             1109.574                285.912  345914.85            4892.0
2010-12-08              529.698                169.098  346744.70            2338.2
2010-12-09             1612.092                201.358  347932.40            6848.2
2010-12-10             1579.692                170.268  345975.37            6741.2
2010-12-11             1836.822                151.144  343926.57            7826.2

#here we are splitting the dataset:
#data up to the end of 2009 goes into the train set, the rest into the test set

data_train = data.loc[:'2009-12-31', :]['Global_active_power']
data_train.head()

date_time
2006-12-16    1209.176
2006-12-17    3390.460
2006-12-18    2203.826
2006-12-19    1666.194
2006-12-20    2225.748
Freq: D, Name: Global_active_power, dtype: float64

data_test = data['2010']['Global_active_power']
data_test.head()

date_time
2010-01-01    1224.252
2010-01-02    1693.778
2010-01-03    1298.728
2010-01-04    1687.440
2010-01-05    1320.158
Freq: D, Name: Global_active_power, dtype: float64

data_train.shape

(1112,)

data_test.shape

(345,)

Observation:

1. We have 1112 data points in the train dataset and 345 data points in the test dataset.

Prepare training data

#training data

data_train.head(14)

date_time
2006-12-16    1209.176
2006-12-17    3390.460
2006-12-18    2203.826
2006-12-19    1666.194
2006-12-20    2225.748
2006-12-21    1723.288
2006-12-22    2341.338
2006-12-23    4773.386
2006-12-24    2550.012
2006-12-25    2743.120
2006-12-26    3934.110
2006-12-27    1528.760
2006-12-28    2072.638
2006-12-29    3174.392
Freq: D, Name: Global_active_power, dtype: float64

#converting the data into a numpy array

data_train = np.array(data_train)

#splitting the data into weekly windows (7 days):
#each sample's input is one week and its target is the following week

X_train, y_train = [], []

for i in range(7, len(data_train) - 7):
    X_train.append(data_train[i-7:i])
    y_train.append(data_train[i:i+7])

#converting lists to numpy arrays

X_train, y_train = np.array(X_train), np.array(y_train)

#shape of the training inputs and targets

X_train.shape, y_train.shape

((1098, 7), (1098, 7))

#printing the y_train values

pd.DataFrame(y_train).head()

          0         1         2         3         4         5         6
0  4773.386  2550.012  2743.120  3934.110  1528.760  2072.638  3174.392
1  2550.012  2743.120  3934.110  1528.760  2072.638  3174.392  2796.108
2  2743.120  3934.110  1528.760  2072.638  3174.392  2796.108  3494.196
3  3934.110  1528.760  2072.638  3174.392  2796.108  3494.196  2749.004
4  1528.760  2072.638  3174.392  2796.108  3494.196  2749.004  1824.760

#normalising the inputs between 0 and 1

x_scaler = MinMaxScaler()
X_train = x_scaler.fit_transform(X_train)

#normalising the targets

y_scaler = MinMaxScaler()
y_train = y_scaler.fit_transform(y_train)

pd.DataFrame(X_train).head()

          0         1         2         3         4         5         6
0  0.211996  0.694252  0.431901  0.313037  0.436748  0.325660  0.462304
1  0.694252  0.431901  0.313037  0.436748  0.325660  0.462304  1.000000
2  0.431901  0.313037  0.436748  0.325660  0.462304  1.000000  0.508439
3  0.313037  0.436748  0.325660  0.462304  1.000000  0.508439  0.551133
4  0.436748  0.325660  0.462304  1.000000  0.508439  0.551133  0.814446

#reshaping to 3 dimensions: (samples, timesteps, features)

X_train = X_train.reshape(1098, 7, 1)

X_train.shape

(1098, 7, 1)

Build LSTM Model

#building a sequential model using Keras

reg = Sequential()
reg.add(LSTM(units=200, activation='relu', input_shape=(7, 1)))
reg.add(Dense(7))

#here we use mean squared error as the loss and Adam as the optimizer

reg.compile(loss='mse', optimizer='adam')

#training the model

reg.fit(X_train, y_train, epochs=100)

Train on 1098 samples

Epoch 1/100
1098/1098 [==============================] - 2s 2ms/sample - loss: 0.0626
Epoch 2/100
1098/1098 [==============================] - 0s 296us/sample - ...
...
Epoch 99/100
1098/1098 [==============================] - 0s 270us/sample - loss: 0.0228
Epoch 100/100
1098/1098 [==============================] - 0s 269us/sample - loss: 0.0228

<tensorflow.python.keras.callbacks.History at 0x19ba56fc668>

Observation:
1. Training is complete, and the final training loss is about 0.0228.

Prepare test dataset and test LSTM model

#test dataset

data_test = np.array(data_test)

#here we split the test data into weekly windows (7 days), as before

X_test, y_test = [], []

for i in range(7, len(data_test) - 7):
    X_test.append(data_test[i-7:i])
    y_test.append(data_test[i:i+7])

X_test, y_test = np.array(X_test), np.array(y_test)

#scaling with the scalers fitted on the training data

X_test = x_scaler.transform(X_test)
y_test = y_scaler.transform(y_test)

#reshaping to 3 dimensions

X_test = X_test.reshape(331, 7, 1)

X_test.shape

(331, 7, 1)
y_pred = reg.predict(X_test)

#bringing y_pred values back to their original scale using the inverse transform

y_pred = y_scaler.inverse_transform(y_pred)

y_pred

array([[1508.9413 , 1476.1537 , 1487.5676 , ..., 1484.8464 , 1459.3864 ,
        1551.5675 ],
       [1158.2788 , 1287.0326 , 1346.428  , ..., 1430.5685 , 1420.6346 ,
        1472.5759 ],
       [1571.7665 , 1507.0337 , 1516.5574 , ..., 1432.5813 , 1393.9161 ,
        1504.1714 ],
       ...,
       [ 952.85785,  852.4236 ,  933.62585, ...,  800.12006,  831.2844 ,
        1005.20844],
       [1579.4896 , 1353.6078 , 1278.9501 , ...,  981.4198 ,  967.6466 ,
        1146.7898 ],
       [1629.0509 , 1392.7751 , 1288.7218 , ..., 1052.977  , 1070.8586 ,
        1243.1346 ]], dtype=float32)

y_true = y_scaler.inverse_transform(y_test)

y_true

array([[ 555.664, 1593.318, 1504.82 , ...,    0.   , 1995.796, 2116.224],
       [1593.318, 1504.82 , 1383.18 , ..., 1995.796, 2116.224, 2196.76 ],
       [1504.82 , 1383.18 ,    0.   , ..., 2116.224, 2196.76 , 2150.112],
       ...,
       [1892.998, 1645.424, 1439.426, ..., 1973.382, 1109.574,  529.698],
       [1645.424, 1439.426, 2035.418, ..., 1109.574,  529.698, 1612.092],
       [1439.426, 2035.418, 1973.382, ...,  529.698, 1612.092, 1579.692]])
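Before computing error metrics, it can help to eyeball one forecast window against the actual values. This plotting snippet is my own addition (not from the original article); it uses the matplotlib import from earlier:

#compare the first predicted week with the actual week

plt.figure(figsize=(10, 4))
plt.plot(y_true[0], label='actual')
plt.plot(y_pred[0], label='predicted')
plt.xlabel('day of forecast week')
plt.ylabel('daily Global_active_power (summed)')
plt.legend()
plt.show()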

Evaluate the model

Since this is a regression problem, we evaluate it with the root mean squared error (RMSE).

def evaluate_model(y_true, y_predicted):
    scores = []

    #calculate an RMSE score for each day of the forecast week
    for i in range(y_true.shape[1]):
        mse = mean_squared_error(y_true[:, i], y_predicted[:, i])
        rmse = np.sqrt(mse)
        scores.append(rmse)

    #calculate a single score for the whole prediction
    total_score = 0
    for row in range(y_true.shape[0]):
        for col in range(y_predicted.shape[1]):
            total_score = total_score + (y_true[row, col] - y_predicted[row, col]) ** 2
    total_score = np.sqrt(total_score / (y_true.shape[0] * y_predicted.shape[1]))

    return total_score, scores

evaluate_model(y_true, y_pred)

(579.2827596682928,
[598.0411885086157,
592.5770673397814,
576.1153945912635,
563.9396525162248,
576.5479538079353,
570.7699415990154,
576.2430188855649])

#standard deviation of the first test week

np.std(y_true[0])

710.0253857243853
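Note that np.std(y_true[0]) measures the spread of the first 7-day window only. A broader reference point (my own addition, not in the original article) would be the spread over the entire test set:

#standard deviation over all test values (alternative reference point)

np.std(y_true)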

Conclusions:
1. From the above experiment, we get an overall root mean squared error of about 579 watts, with per-day RMSE scores between roughly 564 and 598 watts.
2. To judge whether our model is performing well or badly, we compare this against the standard deviation, which here is about 710 watts.
3. The RMSE is less than the standard deviation, hence we can say that our model is performing reasonably well.
