Energy Consumption Prediction Using Machine Learning (Random Forest Model)
Group Members:
1. Vishal Ravindra Bhadane
2. Paritosh Nitin Chaudhari
3. Vinay Nandkishor Shah
4. Sarvesh Umesh Gujrathi
Abstract:
This study predicts power consumption in Tétouan City using an open-source dataset containing
weather and solar diffuse-flow features. Six machine learning models (Linear Regression,
Decision Tree, K-Nearest Neighbors, Support Vector Regression, Gradient Boosting, and Random
Forest) were trained and evaluated. Results show that the Random Forest model achieved by far
the lowest error on held-out data (R² ≈ 0.915), making it the most effective and accurate model
for predicting regional weekly and monthly power consumption.
Libraries Installation
!pip install pandas scikit-learn
Requirement already satisfied: pandas in c:\users\parit\appdata\local\
programs\python\python311\lib\site-packages (2.2.3)
Collecting scikit-learn
Downloading scikit_learn-1.7.1-cp311-cp311-win_amd64.whl.metadata
(11 kB)
Requirement already satisfied: numpy>=1.23.2 in c:\users\parit\
appdata\local\programs\python\python311\lib\site-packages (from
pandas) (2.1.3)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\
parit\appdata\local\programs\python\python311\lib\site-packages (from
pandas) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in c:\users\parit\appdata\
local\programs\python\python311\lib\site-packages (from pandas)
(2025.2)
Requirement already satisfied: tzdata>=2022.7 in c:\users\parit\
appdata\local\programs\python\python311\lib\site-packages (from
pandas) (2025.2)
Collecting scipy>=1.8.0 (from scikit-learn)
Downloading scipy-1.16.1-cp311-cp311-win_amd64.whl.metadata (60 kB)
Requirement already satisfied: joblib>=1.2.0 in c:\users\parit\
appdata\local\programs\python\python311\lib\site-packages (from
scikit-learn) (1.5.1)
Collecting threadpoolctl>=3.1.0 (from scikit-learn)
Downloading threadpoolctl-3.6.0-py3-none-any.whl.metadata (13 kB)
Requirement already satisfied: six>=1.5 in c:\users\parit\appdata\
local\programs\python\python311\lib\site-packages (from python-
dateutil>=2.8.2->pandas) (1.16.0)
Downloading scikit_learn-1.7.1-cp311-cp311-win_amd64.whl (8.9 MB)
Downloading scipy-1.16.1-cp311-cp311-win_amd64.whl (38.6 MB)
Downloading threadpoolctl-3.6.0-py3-none-any.whl (18 kB)
Installing collected packages: threadpoolctl, scipy, scikit-learn
Successfully installed scikit-learn-1.7.1 scipy-1.16.1 threadpoolctl-
3.6.0
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
data = pd.read_csv("D:\Project Stage-II\Tetuan City Power
Consumption.csv")
data.head(5)
        DateTime  Temperature  Humidity  Wind Speed  general diffuse flows  diffuse flows  Zone 1 Power Consumption  Zone 2 Power Consumption  Zone 3 Power Consumption
0  1/1/2017 0:00        6.559      73.8       0.083                  0.051          0.119               34055.69620               16128.87538               20240.96386
1  1/1/2017 0:10        6.414      74.5       0.083                  0.070          0.085               29814.68354               19375.07599               20131.08434
2  1/1/2017 0:20        6.313      74.5       0.080                  0.062          0.100               29128.10127               19006.68693               19668.43373
3  1/1/2017 0:30        6.121      75.0       0.083                  0.091          0.096               28228.86076               18361.09422               18899.27711
4  1/1/2017 0:40        5.921      75.7       0.081                  0.048          0.085               27335.69620               17872.34043               18442.40964
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 52416 entries, 0 to 52415
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 DateTime 52416 non-null object
1 Temperature 52416 non-null float64
2 Humidity 52416 non-null float64
3 Wind Speed 52416 non-null float64
4 general diffuse flows 52416 non-null float64
5 diffuse flows 52416 non-null float64
6 Zone 1 Power Consumption 52416 non-null float64
7 Zone 2 Power Consumption 52416 non-null float64
8 Zone 3 Power Consumption 52416 non-null float64
dtypes: float64(8), object(1)
memory usage: 3.6+ MB
data.columns
Index(['DateTime', 'Temperature', 'Humidity', 'Wind Speed',
       'general diffuse flows', 'diffuse flows', 'Zone 1 Power Consumption',
       'Zone 2 Power Consumption', 'Zone 3 Power Consumption'],
      dtype='object')
data["DateTime"] = pd.to_datetime(data["DateTime"],format="%m/%d/%Y
%H:%M")
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 52416 entries, 0 to 52415
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 DateTime 52416 non-null datetime64[ns]
1 Temperature 52416 non-null float64
2 Humidity 52416 non-null float64
3 Wind Speed 52416 non-null float64
4 general diffuse flows 52416 non-null float64
5 diffuse flows 52416 non-null float64
6 Zone 1 Power Consumption 52416 non-null float64
7 Zone 2 Power Consumption 52416 non-null float64
8 Zone 3 Power Consumption 52416 non-null float64
dtypes: datetime64[ns](1), float64(8)
memory usage: 3.6 MB
data
DateTime Temperature Humidity Wind Speed \
0 2017-01-01 00:00:00 6.559 73.8 0.083
1 2017-01-01 00:10:00 6.414 74.5 0.083
2 2017-01-01 00:20:00 6.313 74.5 0.080
3 2017-01-01 00:30:00 6.121 75.0 0.083
4 2017-01-01 00:40:00 5.921 75.7 0.081
... ... ... ... ...
52411 2017-12-30 23:10:00 7.010 72.4 0.080
52412 2017-12-30 23:20:00 6.947 72.6 0.082
52413 2017-12-30 23:30:00 6.900 72.8 0.086
52414 2017-12-30 23:40:00 6.758 73.0 0.080
52415 2017-12-30 23:50:00 6.580 74.1 0.081
general diffuse flows diffuse flows Zone 1 Power Consumption
\
0 0.051 0.119 34055.69620
1 0.070 0.085 29814.68354
2 0.062 0.100 29128.10127
3 0.091 0.096 28228.86076
4 0.048 0.085 27335.69620
... ... ... ...
52411 0.040 0.096 31160.45627
52412 0.051 0.093 30430.41825
52413 0.084 0.074 29590.87452
52414 0.066 0.089 28958.17490
52415 0.062 0.111 28349.80989
Zone 2 Power Consumption Zone 3 Power Consumption
0 16128.87538 20240.96386
1 19375.07599 20131.08434
2 19006.68693 19668.43373
3 18361.09422 18899.27711
4 17872.34043 18442.40964
... ... ...
52411 26857.31820 14780.31212
52412 26124.57809 14428.81152
52413 25277.69254 13806.48259
52414 24692.23688 13512.60504
52415 24055.23167 13345.49820
[52416 rows x 9 columns]
data["DateTime"] = pd.to_datetime(data["DateTime"])
# Convert DateTime to a Unix timestamp (float seconds) so the model can fit on it
data["DateTime"] = data["DateTime"].apply(lambda x: x.timestamp())
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 52416 entries, 0 to 52415
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 DateTime 52416 non-null float64
1 Temperature 52416 non-null float64
2 Humidity 52416 non-null float64
3 Wind Speed 52416 non-null float64
4 general diffuse flows 52416 non-null float64
5 diffuse flows 52416 non-null float64
6 Zone 1 Power Consumption 52416 non-null float64
7 Zone 2 Power Consumption 52416 non-null float64
8 Zone 3 Power Consumption 52416 non-null float64
dtypes: float64(9)
memory usage: 3.6 MB
Predicting future energy-related features (Temperature, Humidity, Wind Speed, and Diffuse Flows) using an LSTM (Long Short-Term Memory) model
1. Training Data → The data the model learns from.
2. Testing Data → Completely new/unseen data used to check how well the model generalizes.
Training Loss
• It’s the error (difference between predicted values and actual values) the model makes
on the training dataset.
• Calculated after each batch/epoch.
• It shows how well the model is fitting the data it was trained on.
• Ideally, training loss decreases as epochs increase.
Testing Loss
• It’s the error on the test dataset (data the model has never seen during training).
• Used to check if the model is generalizing well.
• If test loss is close to training loss, the model is performing well.
• If test loss is much higher than training loss, it means overfitting (the model memorized
training data but failed on unseen data).
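With the Keras API used in the next cell, both losses can be monitored together during training by passing validation_data to fit(). A minimal sketch, reusing the variable names defined in the cell below (model, X_train, y_train, X_test, y_test):
history = model.fit(
    X_train, y_train,
    validation_data=(X_test, y_test),  # the test loss is reported per epoch as val_loss
    epochs=10, batch_size=32,
)
# A growing gap between loss and val_loss signals overfitting
print("final train loss:", history.history["loss"][-1])
print("final test loss:", history.history["val_loss"][-1])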
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from sklearn.preprocessing import MinMaxScaler
# Load dataset
# data = pd.read_csv('D:\Project Stage-II\Tetuan City Power Consumption.csv')
# Set DateTime column as index
# data['DateTime'] = pd.to_datetime(data['DateTime'])
# data.set_index('DateTime', inplace=True)
# Keep only relevant columns
data = data[['Temperature', 'Humidity', 'Wind Speed', 'general diffuse flows']]
# Normalize the data
scaler = MinMaxScaler()
data_scaled = scaler.fit_transform(data)
# Split into training and test sets
train_size = int(len(data_scaled) * 0.7)
train_data = data_scaled[:train_size]
test_data = data_scaled[train_size:]
# Function to create sequences for LSTM
def create_sequences(dataset, seq_length):
    X = []
    y = []
    for i in range(len(dataset) - seq_length):
        X.append(dataset[i:i+seq_length])
        y.append(dataset[i+seq_length])
    return np.array(X), np.array(y)
seq_length = 10 # Number of time steps
X_train, y_train = create_sequences(train_data, seq_length)
X_test, y_test = create_sequences(test_data, seq_length)
# Build LSTM model
model = Sequential()
model.add(LSTM(64, input_shape=(seq_length, data.shape[1])))
model.add(Dense(data.shape[1]))  # Output layer must match the number of features
# Compile model
model.compile(loss='mse', optimizer='adam')
# Train model
model.fit(X_train, y_train, epochs=10, batch_size=32)
# Evaluate model
train_loss = model.evaluate(X_train, y_train, verbose=0)
test_loss = model.evaluate(X_test, y_test, verbose=0)
print(f'Train Loss: {train_loss}, Test Loss: {test_loss}')
# Predict future values using last sequence
future_data = data_scaled[-seq_length:]  # Take the last 'seq_length' data points
future_data = future_data.reshape((1, seq_length, data.shape[1]))
predicted_data = model.predict(future_data)
# Inverse transform predictions back to original scale
predicted_data = scaler.inverse_transform(predicted_data)
print(f'Predicted value: {predicted_data}')
C:\Users\parit\AppData\Local\Programs\Python\Python311\Lib\site-
packages\keras\src\layers\rnn\rnn.py:200: UserWarning: Do not pass an
`input_shape`/`input_dim` argument to a layer. When using Sequential
models, prefer using an `Input(shape)` object as the first layer in
the model instead.
super().__init__(**kwargs)
Epoch 1/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 22s 15ms/step - loss: 0.0177
Epoch 2/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 21s 15ms/step - loss: 0.0011
Epoch 3/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 20s 15ms/step - loss: 8.8915e-04
Epoch 4/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 20s 15ms/step - loss: 8.2716e-04
Epoch 5/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 21s 15ms/step - loss: 7.1130e-04
Epoch 6/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 21s 16ms/step - loss: 6.8578e-04
Epoch 7/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 20s 15ms/step - loss: 7.0814e-04
Epoch 8/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 21s 15ms/step - loss: 6.4219e-04
Epoch 9/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 21s 15ms/step - loss: 6.5586e-04
Epoch 10/10
1147/1147 ━━━━━━━━━━━━━━━━━━━━ 21s 15ms/step - loss: 6.3775e-04
Train Loss: 0.0006333302590064704, Test Loss: 0.0006365909357555211
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 476ms/step
Predicted value: [[ 6.626758 74.19182 0.11169309 0.9180635 ]]
Interpretation
• The small loss values show that the LSTM model learned the patterns in the data well.
• Training and testing errors are almost the same, which means the model is reliable and
not overfitted.
• The predictions look realistic and match the actual environmental conditions.
This experiment shows that an LSTM model is a good way to forecast weather and energy-related
values in Tétouan City: the low, stable loss indicates accurate one-step predictions, making the
model suitable for forecasting future values of the power-consumption-related features.
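The raw prediction array above is easier to read with the feature names attached. A small sketch, assuming the predicted_data and data variables from the cell above:
predicted_series = pd.Series(predicted_data[0], index=data.columns)
print(predicted_series)  # e.g. Temperature ≈ 6.63, Humidity ≈ 74.19, ...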
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from sklearn.preprocessing import MinMaxScaler
# Load dataset
data = pd.read_csv(r'D:\Project Stage-II\Tetuan City Power Consumption.csv')
# Convert DateTime column and set it as index
data["DateTime"] = pd.to_datetime(data["DateTime"], format="%m/%d/%Y
%H:%M")
data.set_index('DateTime', inplace=True)
# Keep only relevant columns
data = data[['Temperature', 'Humidity', 'Wind Speed', 'general diffuse flows', 'diffuse flows']]
# Normalize the data
scaler = MinMaxScaler()
data_scaled = scaler.fit_transform(data)
# Function to create sequences for LSTM
def create_sequences(dataset, seq_length):
    X = []
    y = []
    for i in range(len(dataset) - seq_length):
        X.append(dataset[i:i+seq_length])
        y.append(dataset[i+seq_length])
    return np.array(X), np.array(y)
seq_length = 10 # Number of time steps
X, y = create_sequences(data_scaled, seq_length)
# Build LSTM model
model = Sequential()
model.add(LSTM(64, input_shape=(seq_length, data.shape[1])))
model.add(Dense(data.shape[1]))  # Output layer size = number of features
# Compile model
model.compile(loss='mse', optimizer='adam')
# Train model
model.fit(X, y, epochs=12, batch_size=32)
# Prepare data for future prediction
last_data = data_scaled[-seq_length:]
future_data = np.array([last_data])
# Predict multiple future steps (2 days = 288 ten-minute intervals)
predicted_data = []
num_predictions = 288
for _ in range(num_predictions):
    prediction = model.predict(future_data)
    predicted_data.append(prediction)
    # Append the prediction to future_data for the next step
    future_data = np.append(future_data[:, 1:, :], prediction.reshape(1, 1, data.shape[1]), axis=1)
# Convert predictions back to original scale
predicted_data = np.array(predicted_data).reshape((num_predictions, data.shape[1]))
predicted_data = scaler.inverse_transform(predicted_data)
# Create DataFrame for predictions
prediction_dates = pd.date_range(start=data.index[-1] + pd.Timedelta(minutes=10), periods=num_predictions, freq='10min')
prediction_df = pd.DataFrame(predicted_data, columns=data.columns, index=prediction_dates)
prediction_df.insert(0, 'DateTime', prediction_dates)
prediction_df.reset_index(drop=True, inplace=True)
print(prediction_df)
# (Optional) Evaluate model performance on predicted data
predicted_values_X, predicted_values_y = create_sequences(predicted_data, seq_length)
predicted_values_X = np.reshape(predicted_values_X, (predicted_values_X.shape[0], seq_length, data.shape[1]))
score = model.evaluate(predicted_values_X, predicted_values_y)
print("Model Score:", score)
Epoch 1/12
C:\Users\parit\AppData\Local\Programs\Python\Python311\Lib\site-
packages\keras\src\layers\rnn\rnn.py:200: UserWarning: Do not pass an
`input_shape`/`input_dim` argument to a layer. When using Sequential
models, prefer using an `Input(shape)` object as the first layer in
the model instead.
super().__init__(**kwargs)
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 30s 15ms/step - loss: 0.0079
Epoch 2/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 42s 15ms/step - loss: 9.6613e-04
Epoch 3/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 40s 15ms/step - loss: 8.1364e-04
Epoch 4/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 41s 15ms/step - loss: 7.9632e-04
Epoch 5/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 42s 15ms/step - loss: 7.4271e-04
Epoch 6/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 40s 15ms/step - loss: 7.2107e-04
Epoch 7/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 41s 15ms/step - loss: 7.4393e-04
Epoch 8/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 42s 16ms/step - loss: 6.7623e-04
Epoch 9/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 41s 15ms/step - loss: 7.1271e-04
Epoch 10/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 40s 15ms/step - loss: 6.8490e-04
Epoch 11/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 42s 15ms/step - loss: 6.7754e-04
Epoch 12/12
1638/1638 ━━━━━━━━━━━━━━━━━━━━ 40s 15ms/step - loss: 6.9087e-04
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s ~100ms/step   (repeated once per forecast step, 288 lines in total)
DateTime Temperature Humidity Wind Speed \
0 2017-12-31 00:00:00 6.424418 74.636497 0.078743
1 2017-12-31 00:10:00 6.296503 75.038246 0.076011
2 2017-12-31 00:20:00 6.163193 75.404900 0.073155
3 2017-12-31 00:30:00 6.025410 75.768738 0.070414
4 2017-12-31 00:40:00 5.882825 76.140709 0.067758
.. ... ... ... ...
283 2018-01-01 23:10:00 37.548275 99.034119 -1.371672
284 2018-01-01 23:20:00 37.522469 98.932373 -1.365601
285 2018-01-01 23:30:00 37.496876 98.831642 -1.359623
286 2018-01-01 23:40:00 37.471497 98.731911 -1.353733
287 2018-01-01 23:50:00 37.446320 98.633171 -1.347932
general diffuse flows diffuse flows
0 -1.112825 1.522756
1 -1.642341 3.253514
2 -2.363314 5.118820
3 -3.218098 7.186725
4 -4.163715 9.483515
.. ... ...
283 343.439362 260.021179
284 343.339325 259.739288
285 343.239685 259.460968
286 343.140533 259.186035
287 343.041809 258.914459
[288 rows x 6 columns]
9/9 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - loss: 45924.4609
Model Score: 50276.43359375
The model score is the MSE of the model's forecasts. Note that it is computed here on the
inverse-transformed (original-scale) values, while the model was trained on 0-1 scaled data,
which is why it is so much larger than the training loss; a smaller score still means lower
error, as shown in the sketch below.
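To get a number comparable to the ~6.9e-04 training loss, the forecasts can be put back on the 0-1 scale before calling evaluate(). A minimal sketch, assuming the scaler, predicted_data, seq_length, model, and create_sequences defined above:
# Re-scale the inverse-transformed forecasts to the 0-1 range the model was trained on
scaled_preds = scaler.transform(predicted_data)
X_chk, y_chk = create_sequences(scaled_preds, seq_length)
print("Score on scaled forecasts:", model.evaluate(X_chk, y_chk, verbose=0))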
Interpretation
• It uses past data (Temperature, Humidity, Wind Speed, and Diffuse Flows) to predict
future values every 10 minutes.
• The model doesn’t just predict one step but keeps using its own predictions to forecast
many steps ahead (multi-step forecasting).
• This allows us to see future trends for several days, not just the next value.
• The only drawback is that small errors in early predictions can grow as we go further into
the future.
The model learns from past weather/power data and predicts future values for multiple time steps,
but long forecasts may become less accurate over time.
Difference Between First and Second LSTM Model
Aspect             First LSTM Model                        Second LSTM Model
Input Features     4 features (Temperature, Humidity,      5 features (adds diffuse flows)
                   Wind Speed, general diffuse flows)
Output             Predicts the next single step           Predicts multiple future steps (e.g., 2-7 days)
Forecasting Style  One-step forecasting                    Recursive multi-step forecasting
Use Case           Short-term prediction                   Long-term prediction (daily/weekly trends)
                   (next 10-min value)
Strength           High accuracy for immediate             Provides continuous future trend forecasts
                   prediction
Limitation         Cannot directly forecast                Errors may accumulate in long forecasts
                   multiple days ahead
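The recursive multi-step loop used by the second model can be wrapped in a reusable helper. A sketch with a hypothetical function name, assuming a trained Keras model and the MinMaxScaler from above:
import numpy as np

def recursive_forecast(model, scaler, last_window, n_steps):
    # last_window has shape (1, seq_length, n_features)
    window = last_window.copy()
    steps = []
    for _ in range(n_steps):
        pred = model.predict(window, verbose=0)   # shape (1, n_features)
        steps.append(pred[0])
        # Slide the window forward: drop the oldest step, append the prediction
        window = np.append(window[:, 1:, :], pred.reshape(1, 1, -1), axis=1)
    return scaler.inverse_transform(np.array(steps))  # back to original units

# Example: forecast = recursive_forecast(model, scaler, future_data, 288)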
Installing more dependencies
pip install joblib pandas scikit-learn matplotlib
Note: you may need to restart the kernel to use updated packages.
Requirement already satisfied: joblib in c:\users\parit\appdata\local\programs\python\python311\lib\site-packages (1.5.1)
Requirement already satisfied: pandas in c:\users\parit\appdata\local\
programs\python\python311\lib\site-packages (2.2.3)
Requirement already satisfied: scikit-learn in c:\users\parit\appdata\
local\programs\python\python311\lib\site-packages (1.7.1)
Collecting matplotlib
Downloading matplotlib-3.10.5-cp311-cp311-win_amd64.whl.metadata (11
kB)
Requirement already satisfied: numpy>=1.23.2 in c:\users\parit\
appdata\local\programs\python\python311\lib\site-packages (from
pandas) (2.1.3)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\
parit\appdata\local\programs\python\python311\lib\site-packages (from
pandas) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in c:\users\parit\appdata\
local\programs\python\python311\lib\site-packages (from pandas)
(2025.2)
Requirement already satisfied: tzdata>=2022.7 in c:\users\parit\
appdata\local\programs\python\python311\lib\site-packages (from
pandas) (2025.2)
Requirement already satisfied: scipy>=1.8.0 in c:\users\parit\appdata\
local\programs\python\python311\lib\site-packages (from scikit-learn)
(1.16.1)
Requirement already satisfied: threadpoolctl>=3.1.0 in c:\users\parit\
appdata\local\programs\python\python311\lib\site-packages (from
scikit-learn) (3.6.0)
Collecting contourpy>=1.0.1 (from matplotlib)
Downloading contourpy-1.3.3-cp311-cp311-win_amd64.whl.metadata (5.5
kB)
Collecting cycler>=0.10 (from matplotlib)
Downloading cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib)
Downloading fonttools-4.59.1-cp311-cp311-win_amd64.whl.metadata (111
kB)
Collecting kiwisolver>=1.3.1 (from matplotlib)
Downloading kiwisolver-1.4.9-cp311-cp311-win_amd64.whl.metadata (6.4
kB)
Requirement already satisfied: packaging>=20.0 in c:\users\parit\
appdata\local\programs\python\python311\lib\site-packages (from
matplotlib) (24.1)
Collecting pillow>=8 (from matplotlib)
Downloading pillow-11.3.0-cp311-cp311-win_amd64.whl.metadata (9.2
kB)
Collecting pyparsing>=2.3.1 (from matplotlib)
Downloading pyparsing-3.2.3-py3-none-any.whl.metadata (5.0 kB)
Requirement already satisfied: six>=1.5 in c:\users\parit\appdata\
local\programs\python\python311\lib\site-packages (from python-
dateutil>=2.8.2->pandas) (1.16.0)
Downloading matplotlib-3.10.5-cp311-cp311-win_amd64.whl (8.1 MB)
Downloading contourpy-1.3.3-cp311-cp311-win_amd64.whl (225 kB)
Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
Downloading fonttools-4.59.1-cp311-cp311-win_amd64.whl (2.3 MB)
Downloading kiwisolver-1.4.9-cp311-cp311-win_amd64.whl (73 kB)
Downloading pillow-11.3.0-cp311-cp311-win_amd64.whl (7.0 MB)
Downloading pyparsing-3.2.3-py3-none-any.whl (111 kB)
Installing collected packages: pyparsing, pillow, kiwisolver,
fonttools, cycler, contourpy, matplotlib
Successfully installed contourpy-1.3.3 cycler-0.12.1 fonttools-4.59.1
kiwisolver-1.4.9 matplotlib-3.10.5 pillow-11.3.0 pyparsing-3.2.3
Random Forest Regression Model for Energy Consumption Prediction
import joblib
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
import matplotlib.pyplot as plt
# Load dataset
data = pd.read_csv(r'D:\Project Stage-II\Tetuan City Power Consumption.csv')
# Convert DateTime to numerical format
data["DateTime"] = pd.to_datetime(data["DateTime"], format="%m/%d/%Y
%H:%M")
data["DateTime"] = data["DateTime"].values.astype(float)
# Separate features and target (predicting Zone 1 Power Consumption)
X = data.drop(columns=["Zone 1 Power Consumption", "Zone 2 Power Consumption", "Zone 3 Power Consumption"])
y = data["Zone 1 Power Consumption"]
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
# Initialize the random forest regressor
rf_model = RandomForestRegressor(n_estimators=100, random_state=42)
# Train the model
rf_model.fit(X_train, y_train)
# Predictions on test data
y_pred = rf_model.predict(X_test)
# Save the model
joblib.dump(rf_model, "rf_model.joblib")
# Evaluate model
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print("R^2 score:", r2)
print("Mean Squared Error (MSE):", mse)
print("Mean Absolute Error (MAE):", mae)
# Random Forest Algorithm (visualization)
# Predictions on the full, time-ordered dataset (kept for reference)
X_new = data[["DateTime", "Temperature", "Humidity", "Wind Speed", "general diffuse flows", "diffuse flows"]]
y_pred_full = rf_model.predict(X_new)
# Plot the test-set predictions against the actual test values; the shuffled
# test rows would not line up with the first rows of the full-dataset predictions
plt.figure(figsize=(10, 6))
plt.plot(y_test.values, label='Actual Values')
plt.plot(y_pred, label='Predicted Values')
plt.xlabel('Sample Index')
plt.ylabel('Power Consumption')
plt.title('Actual vs Predicted Power Consumption (Random Forest)')
plt.legend()
plt.show()
R^2 score: 0.9150239635644875
Mean Squared Error (MSE): 4243904.892595624
Mean Absolute Error (MAE): 1274.1475632488546
Interpretation of Results
1. R² Score tells how much variation in energy consumption is explained by the model.
– Closer to 1 = very good model.
– Closer to 0 = poor model.
2. MSE (Mean Squared Error): Average of squared errors. Smaller means predictions
are closer to actual.
3. MAE (Mean Absolute Error): Average absolute difference between predicted and
real values. Easy to understand in real units.
4. Graph:
– The blue line (Actual Values) shows real consumption.
– The orange line (Predicted Values) shows what the model estimated.
– If they are close, the model is doing well.
Model Accuracy = (R^2)*100 = 91.502%
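The three metrics can also be computed directly from their definitions as a sanity check. A minimal sketch using only NumPy, with y_test and the test-set y_pred from the cell above:
import numpy as np

y_true = np.asarray(y_test)
residuals = y_true - y_pred
mse_manual = np.mean(residuals ** 2)       # Mean Squared Error
mae_manual = np.mean(np.abs(residuals))    # Mean Absolute Error
# R^2 = 1 - SS_res / SS_tot
r2_manual = 1 - np.sum(residuals ** 2) / np.sum((y_true - y_true.mean()) ** 2)
print(r2_manual, mse_manual, mae_manual)   # should match the sklearn values above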
prediction_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 288 entries, 0 to 287
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 DateTime 288 non-null datetime64[ns]
1 Temperature 288 non-null float32
2 Humidity 288 non-null float32
3 Wind Speed 288 non-null float32
4 general diffuse flows 288 non-null float32
5 diffuse flows 288 non-null float32
dtypes: datetime64[ns](1), float32(5)
memory usage: 8.0 KB
The code after the next installation cell loads the trained Random Forest model that was saved earlier.
pip install openpyxl  # used to export DataFrames to Excel (.xlsx) format
Collecting openpyxl
Downloading openpyxl-3.1.5-py2.py3-none-any.whl.metadata (2.5 kB)
Collecting et-xmlfile (from openpyxl)
Downloading et_xmlfile-2.0.0-py3-none-any.whl.metadata (2.7 kB)
Downloading openpyxl-3.1.5-py2.py3-none-any.whl (250 kB)
Downloading et_xmlfile-2.0.0-py3-none-any.whl (18 kB)
Installing collected packages: et-xmlfile, openpyxl
Successfully installed et-xmlfile-2.0.0 openpyxl-3.1.5
Note: you may need to restart the kernel to use updated packages.
import pandas as pd
import joblib
import matplotlib.pyplot as plt
# Load the trained model
rf_model = joblib.load('rf_model.joblib')
a = prediction_df
a["DateTime"] = a["DateTime"].values.astype(float)
# Match the expected column names of the model with the predicted dataset
# prediction_df = prediction_df[['Temperature', 'Humidity', 'Wind Speed', 'general diffuse flows', 'diffuse flows']]
column_names = ['DateTime', 'Temperature', 'Humidity', 'Wind Speed', 'general diffuse flows', 'diffuse flows', 'Zone 1 Power Consumption']
new_data = pd.DataFrame(columns=column_names)
# Predict Zone 1 Power Consumption values
zone1_pred = rf_model.predict(a)
# Add predicted values to the new_data DataFrame
new_data['DateTime'] = a['DateTime']
new_data['Temperature'] = a['Temperature']
new_data['Humidity'] = a['Humidity']
new_data['Wind Speed'] = a['Wind Speed']
new_data['general diffuse flows'] = a['general diffuse flows']
new_data['diffuse flows'] = a['diffuse flows']
new_data['Zone 1 Power Consumption'] = zone1_pred
# Print predicted values
print(new_data)
# Save predicted values to Excel file
new_data.to_excel('predictionsFinal.xlsx', index=False)
# Show predicted values on a graph
plt.figure(figsize=(10, 6))
plt.plot(new_data['DateTime'], new_data['Zone 1 Power Consumption'], label='Predicted Values')
plt.xlabel('DateTime')
plt.ylabel('Zone 1 Power Consumption')
plt.title('Zone 1 Predictions')
plt.legend()
plt.show()
         DateTime  Temperature   Humidity  Wind Speed  general diffuse flows  diffuse flows  Zone 1 Power Consumption
0    1.514678e+18     6.424418  74.636497    0.078743              -1.112825       1.522756              29201.155893
1    1.514679e+18     6.296503  75.038246    0.076011              -1.642341       3.253514              28949.620415
2    1.514680e+18     6.163193  75.404900    0.073155              -2.363314       5.118820              28722.916766
3    1.514680e+18     6.025410  75.768738    0.070414              -3.218098       7.186725              28715.965422
4    1.514681e+18     5.882825  76.140709    0.067758              -4.163715       9.483515              28530.353255
..            ...          ...        ...         ...                    ...            ...                       ...
283  1.514848e+18    37.548275  99.034119   -1.371672             343.439362     260.021179              36017.277435
284  1.514849e+18    37.522469  98.932373   -1.365601             343.339325     259.739288              36017.277435
285  1.514849e+18    37.496876  98.831642   -1.359623             343.239685     259.460968              36017.277435
286  1.514850e+18    37.471497  98.731911   -1.353733             343.140533     259.186035              36017.277435
287  1.514851e+18    37.446320  98.633171   -1.347932             343.041809     258.914459              36024.350001

[288 rows x 7 columns]
• It takes new input data (prediction_df) and predicts Zone 1 power consumption.
• A new table (new_data) is created, which contains:
– Input features (DateTime, Temperature, Humidity, Wind Speed, etc.)
– Predicted power consumption for Zone 1.
We are using the trained model to predict future electricity consumption for Zone 1 based on
weather conditions and other factors.
The results are stored in Excel and visualized in a line chart, so we can clearly see how power
consumption changes with time.
• 1.514678e+18 = Unix timestamp in nanoseconds
• Divide by 1e9 → you get seconds → a normal Unix timestamp
• Convert → it becomes a standard DateTime: 2017-12-31 00:00
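A quick check of this conversion chain (pandas interprets a bare integer as nanoseconds by default):
import pandas as pd

ts_ns = 1_514_678_400_000_000_000              # nanoseconds since the Unix epoch
print(pd.to_datetime(ts_ns))                   # 2017-12-31 00:00:00
print(pd.to_datetime(ts_ns / 10**9, unit='s')) # same instant, from seconds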
#X = X.astype({'DateTime': 'datetime64[ns]'})
X.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 52416 entries, 0 to 52415
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 DateTime 52416 non-null float64
1 Temperature 52416 non-null float64
2 Humidity 52416 non-null float64
3 Wind Speed 52416 non-null float64
4 general diffuse flows 52416 non-null float64
5 diffuse flows 52416 non-null float64
dtypes: float64(6)
memory usage: 2.4 MB
The following code takes date and time as input and tries to predict
weather/environmental values (temperature, humidity, wind, etc.) using separate
Random Forest models. Finally, it stores all the predictions in an Excel file.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import joblib
# Separate independent variable and target variables
X = data[['DateTime']]
#X["DateTime"] = X["DateTime"].apply(lambda x: x.timestamp())
X = X.astype({'DateTime': 'datetime64[ns]'})
#X["DateTime"]=pd.to_datetime(X["DateTime"],format="%m/%d/%Y %H:%M")
#X["DateTime"] = pd.to_datetime(X["DateTime"])
y_temp = data['Temperature']
y_humidity = data['Humidity']
y_wind = data['Wind Speed']
y_general = data['general diffuse flows']
y_diffuse = data['diffuse flows']
# Create training and testing datasets
# The same random_state keeps all five splits row-aligned
X_train, X_test, y_temp_train, y_temp_test = train_test_split(X, y_temp, test_size=0.2, random_state=42)
X_train, X_test, y_humidity_train, y_humidity_test = train_test_split(X, y_humidity, test_size=0.2, random_state=42)
X_train, X_test, y_wind_train, y_wind_test = train_test_split(X, y_wind, test_size=0.2, random_state=42)
X_train, X_test, y_general_train, y_general_test = train_test_split(X, y_general, test_size=0.2, random_state=42)
X_train, X_test, y_diffuse_train, y_diffuse_test = train_test_split(X, y_diffuse, test_size=0.2, random_state=42)
# Create and train Random Forest Regression models
model_temp = RandomForestRegressor()
model_temp.fit(X_train, y_temp_train)
model_humidity = RandomForestRegressor()
model_humidity.fit(X_train, y_humidity_train)
model_wind = RandomForestRegressor()
model_wind.fit(X_train, y_wind_train)
model_general = RandomForestRegressor()
model_general.fit(X_train, y_general_train)
model_diffuse = RandomForestRegressor()
model_diffuse.fit(X_train, y_diffuse_train)
# Save the models
joblib.dump(model_temp, 'temp_model.joblib')
joblib.dump(model_humidity, 'humidity_model.joblib')
joblib.dump(model_wind, 'wind_model.joblib')
joblib.dump(model_general, 'general_model.joblib')
joblib.dump(model_diffuse, 'diffuse_model.joblib')
# Make predictions
y_temp_pred = model_temp.predict(X_test)
y_humidity_pred = model_humidity.predict(X_test)
y_wind_pred = model_wind.predict(X_test)
y_general_pred = model_general.predict(X_test)
y_diffuse_pred = model_diffuse.predict(X_test)
# Evaluate prediction performance
mse_temp = mean_squared_error(y_temp_test, y_temp_pred)
mse_humidity = mean_squared_error(y_humidity_test, y_humidity_pred)
mse_wind = mean_squared_error(y_wind_test, y_wind_pred)
mse_general = mean_squared_error(y_general_test, y_general_pred)
mse_diffuse = mean_squared_error(y_diffuse_test, y_diffuse_pred)
"""
print("Temperature Prediction Error (MSE):", mse_temp)
print("Humidity Prediction Error (MSE):", mse_humidity)
print("Wind Speed Prediction Error (MSE):", mse_wind)
print("General Diffuse Flows Prediction Error (MSE):", mse_general)
print("Diffuse Flows Prediction Error (MSE):", mse_diffuse)
"""
"""print("\nPredicted Temperature Values:")
print(y_temp_pred)
print("\nPredicted Humidity Values:")
print(y_humidity_pred)
print("\nPredicted Wind Speed Values:")
print(y_wind_pred)
print("\nPredicted General Diffuse Flow Values:")
print(y_general_pred)
print("\nPredicted Diffuse Flow Values:")
print(y_diffuse_pred)"""
predictions = pd.DataFrame({
    "DateTime": X_test["DateTime"],
    "Predicted Temperature": y_temp_pred,
    "Predicted Humidity": y_humidity_pred,
    "Predicted Wind Speed": y_wind_pred,
    "Predicted General Diffuse Flows": y_general_pred,
    "Predicted Diffuse Flows": y_diffuse_pred
})
# Write the DataFrame into an Excel file
predictions.to_excel("predictions4.xlsx", index=False)
This script is training five Random Forest Regression models to predict different
weather/environmental features based only on DateTime.
Given the weather and sunlight conditions at a certain time, how much
energy would Zone 1 use?
new_data
         DateTime  Temperature   Humidity  Wind Speed  general diffuse flows  diffuse flows  Zone 1 Power Consumption
0    1.514678e+18     6.424418  74.636497    0.078743              -1.112825       1.522756              29201.155893
1    1.514679e+18     6.296503  75.038246    0.076011              -1.642341       3.253514              28949.620415
2    1.514680e+18     6.163193  75.404900    0.073155              -2.363314       5.118820              28722.916766
3    1.514680e+18     6.025410  75.768738    0.070414              -3.218098       7.186725              28715.965422
4    1.514681e+18     5.882825  76.140709    0.067758              -4.163715       9.483515              28530.353255
..            ...          ...        ...         ...                    ...            ...                       ...
283  1.514848e+18    37.548275  99.034119   -1.371672             343.439362     260.021179              36017.277435
284  1.514849e+18    37.522469  98.932373   -1.365601             343.339325     259.739288              36017.277435
285  1.514849e+18    37.496876  98.831642   -1.359623             343.239685     259.460968              36017.277435
286  1.514850e+18    37.471497  98.731911   -1.353733             343.140533     259.186035              36017.277435
287  1.514851e+18    37.446320  98.633171   -1.347932             343.041809     258.914459              36024.350001

[288 rows x 7 columns]
import joblib
import pandas as pd
# Load the trained model
rf_model = joblib.load("rf_model.joblib")
# Prepare the new data for prediction (next week's data)
new_data = pd.DataFrame({
    "DateTime": pd.date_range(start='2023-06-11', periods=7, freq='D'),
    "Temperature": [25.5, 26.1, 24.8, 25.3, 24.7, 23.9, 24.5],
    "Humidity": [60, 55, 62, 57, 63, 58, 59],
    "Wind Speed": [10, 12, 8, 11, 9, 10, 11],
    "general diffuse flows": [80, 85, 90, 88, 86, 82, 84],
    "diffuse flows": [100, 110, 95, 105, 98, 102, 99]
})
X_new = new_data
X_new["DateTime"] = pd.to_datetime(X_new["DateTime"]).astype('int64') / 10**9
y_pred = rf_model.predict(X_new)
# Add the predicted values to the new_data DataFrame
new_data["Zone 1 Power Consumption"] = y_pred
# Print the predicted values for Zone 1 Power Consumption
print("Predicted Zone 1 Power Consumption Values:")
print(new_data["Zone 1 Power Consumption"])
new_data.to_excel("predictions.xlsx", index=False)
# Given the weather conditions on each day, this predicts how much power Zone 1 is expected to consume according to the trained model.
Predicted Zone 1 Power Consumption Values:
0 40851.563225
1 41911.827830
2 39548.102861
3 41634.460795
4 40656.391253
5 41059.696665
6 40797.730651
Name: Zone 1 Power Consumption, dtype: float64
Linear Regression
# ------------------------------
# Linear Regression for Zone 1 Power Consumption
# ------------------------------
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
# ------------------------------
# Load Dataset
# ------------------------------
data = pd.read_csv(r'D:\Project Stage-II\Tetuan City Power Consumption.csv')
# Convert DateTime to datetime format
data["DateTime"] = pd.to_datetime(data["DateTime"], format="%m/%d/%Y
%H:%M")
# ------------------------------
# Features (X) and Target (y)
# ------------------------------
X = data.drop(columns=["Zone 1 Power Consumption", "Zone 2 Power Consumption", "Zone 3 Power Consumption"])
y = data["Zone 1 Power Consumption"]
# Convert DateTime to numeric (seconds since epoch)
X["DateTime"] = X["DateTime"].astype('int64') // 10**9
# ------------------------------
# Split into Training and Testing sets
# ------------------------------
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# ------------------------------
# Initialize and Train Linear Regression Model
# ------------------------------
lr_model = LinearRegression()
lr_model.fit(X_train, y_train)
# ------------------------------
# Make Predictions on Testing Data
# ------------------------------
y_pred = lr_model.predict(X_test)
# ------------------------------
# Evaluate Model
# ------------------------------
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print("R^2 Score:", r2)
print("Mean Squared Error (MSE):", mse)
print("Mean Absolute Error (MAE):", mae)
# ------------------------------
# Make Predictions on Full Dataset (New Data)
# ------------------------------
X_new = data[["DateTime", "Temperature", "Humidity", "Wind Speed", "general diffuse flows", "diffuse flows"]].copy()
# Ensure DateTime is numeric
X_new["DateTime"] = pd.to_datetime(X_new["DateTime"]).astype('int64')
// 10**9
# Predict Zone 1 Power Consumption
y_new_pred = lr_model.predict(X_new)
# Add predictions to the DataFrame
data["Predicted Zone 1 Power Consumption"] = y_new_pred
# ------------------------------
# Visualize Actual vs Predicted Values
# ------------------------------
plt.figure(figsize=(12,6))
plt.plot(y_test.values, label='Actual Values')
plt.plot(y_pred, label='Predicted Values')
plt.xlabel('Sample Index')
plt.ylabel('Zone 1 Power Consumption')
plt.title('Actual vs Predicted Values for Zone 1')
plt.legend()
plt.show()
# ------------------------------
# Save Predictions to Excel
# ------------------------------
data.to_excel('LinearRegression_Predictions.xlsx', index=False)
print("Predictions saved to LinearRegression_Predictions.xlsx")
R^2 Score: 0.22780164454656437
Mean Squared Error (MSE): 38991682.14209458
Mean Absolute Error (MAE): 5130.74156223554
Predictions saved to LinearRegression_Predictions.xlsx
Model Accuracy = (R^2)*100 = 22.780%
The low score suggests the relationship between the raw features and consumption is strongly nonlinear (daily and seasonal cycles), which a single linear fit cannot capture.
Support Vector Regression (SVR)
# Support Vector Regression (SVR) Model
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
# Load dataset
data = pd.read_csv(r'D:\Project Stage-II\Tetuan City Power Consumption.csv')
# Convert DateTime to datetime type
data['DateTime'] = pd.to_datetime(data['DateTime'], format="%m/%d/%Y %H:%M")
# Convert DateTime to numeric for ML
data['DateTime'] = data['DateTime'].astype('int64') / 10**9
# Separate features and target
X = data[['DateTime', 'Temperature', 'Humidity', 'Wind Speed', 'general diffuse flows', 'diffuse flows']]
y = data['Zone 1 Power Consumption']
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize and train the SVR model
svr_model = SVR()
svr_model.fit(X_train, y_train)
# Predict on the test set
y_pred = svr_model.predict(X_test)
# Evaluate the model
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print("R^2 score:", r2)
print("Mean Squared Error (MSE):", mse)
print("Mean Absolute Error (MAE):", mae)
# Predict on the entire dataset (or new dataset)
X_new = X.copy() # Using same features as training
y_pred_new = svr_model.predict(X_new)
# Add predictions to DataFrame
data['Predicted Zone 1 Power Consumption'] = y_pred_new
# Plot predicted vs actual
plt.figure(figsize=(10,6))
plt.plot(data['Zone 1 Power Consumption'], label='Actual Values')
plt.plot(data['Predicted Zone 1 Power Consumption'], label='Predicted Values')
plt.xlabel('Index')
plt.ylabel('Zone 1 Power Consumption')
plt.title('Actual vs Predicted Zone 1 Power Consumption')
plt.legend()
plt.show()
R^2 score: -0.0003503761581964415
Mean Squared Error (MSE): 50512078.434796944
Mean Absolute Error (MAE): 5910.367951519604
Model Accuracy = (R^2)*100 ≈ -0.035% (the worst of all models)
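SVR with the default RBF kernel is extremely scale-sensitive, and here the DateTime column (on the order of 1e9 seconds) dwarfs every other feature, so the model effectively predicts a constant, hence the near-zero R². The usual fix is to standardize the inputs inside a pipeline; a minimal sketch (standard scikit-learn API, not run in this notebook) that would likely improve the score substantially:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Scale every feature to zero mean / unit variance before the RBF kernel sees it
svr_scaled = make_pipeline(StandardScaler(), SVR())
svr_scaled.fit(X_train, y_train)
print("R^2 with scaling:", svr_scaled.score(X_test, y_test))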
Decision Tree Model
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
# Convert DateTime to numeric timestamp
data["DateTime"] = pd.to_datetime(data["DateTime"])
data["DateTime"] = data["DateTime"].astype(int) / 10**9 # Convert to
seconds since epoch
# Separate features and target
X = data[["DateTime","Temperature", "Humidity", "Wind Speed", "general
diffuse flows", "diffuse flows"]]
y = data["Zone 1 Power Consumption"]
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize and train the Decision Tree model
dt_model = DecisionTreeRegressor()
dt_model.fit(X_train, y_train)
# Make predictions on testing data
y_pred_test = dt_model.predict(X_test)
# Evaluate model
r2 = r2_score(y_test, y_pred_test)
mse = mean_squared_error(y_test, y_pred_test)
mae = mean_absolute_error(y_test, y_pred_test)
print("R^2 score:", r2)
print("Mean Squared Error (MSE):", mse)
print("Mean Absolute Error (MAE):", mae)
# Predict on the entire dataset (or new data)
X_new = X.copy() # Ensure features match exactly
y_pred_new = dt_model.predict(X_new)
# Plot predicted vs actual
plt.figure(figsize=(10,6))
plt.plot(y, label='Actual Values')
plt.plot(y_pred_new, label='Predicted Values')
plt.xlabel('Sample Index')
plt.ylabel('Zone 1 Power Consumption')
plt.title('Actual vs Predicted Zone 1 Power Consumption')
plt.legend()
plt.show()
R^2 score: 0.7503346633529636
Mean Squared Error (MSE): 12606697.980759043
Mean Absolute Error (MAE): 1648.4178390452116
Model Accuracy = (R^2)*100 = 75.033%
Gradient Boosting Regressor
# Separate features and target
# Note: if the SVR cell above was run, data also still contains its
# 'Predicted Zone 1 Power Consumption' column, which this drop() does not remove
X = data.drop(columns=["Zone 1 Power Consumption", "Zone 2 Power Consumption", "Zone 3 Power Consumption"])
y = data["Zone 1 Power Consumption"]
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
from sklearn.ensemble import GradientBoostingRegressor
# Initialize the gradient boosting regressor
gb_model = GradientBoostingRegressor()
# Train the model on the training data
gb_model.fit(X_train, y_train)
# Make predictions on the testing data
y_pred = gb_model.predict(X_test)
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
# Calculate R^2 score
r2 = r2_score(y_test, y_pred)
# Calculate mean squared error (MSE)
mse = mean_squared_error(y_test, y_pred)
# Calculate mean absolute error (MAE)
mae = mean_absolute_error(y_test, y_pred)
print("R^2 score:", r2)
print("Mean Squared Error (MSE):", mse)
print("Mean Absolute Error (MAE):", mae)
R^2 score: 0.4177769575005359
Mean Squared Error (MSE): 29398995.282257207
Mean Absolute Error (MAE): 4230.622513487244
Model Accuracy = (R^2)*100 = 41.778%
K-Nearest Neighbors (KNN) Regressor
import joblib
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
# Convert DateTime to numeric timestamp
data["DateTime"] = pd.to_datetime(data["DateTime"]).astype('int64') / 10**9  # seconds since epoch
# Separate features and target
X = data.drop(columns=["Zone 1 Power Consumption", "Zone 2 Power Consumption", "Zone 3 Power Consumption"])
y = data["Zone 1 Power Consumption"]
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize and train KNN
knn_model = KNeighborsRegressor()
knn_model.fit(X_train, y_train)
# Predict on test data
y_test_pred = knn_model.predict(X_test)
# Performance metrics
print("R^2 score:", r2_score(y_test, y_test_pred))
print("Mean Squared Error (MSE):", mean_squared_error(y_test,
y_test_pred))
print("Mean Absolute Error (MAE):", mean_absolute_error(y_test,
y_test_pred))
# Save the model
joblib.dump(knn_model, "knn_model.joblib")
# Prepare new data for prediction
X_new = X.copy() # Ensure same columns as training data
# Predict on new data
y_new_pred = knn_model.predict(X_new)
# Plot
plt.figure(figsize=(10, 6))
plt.plot(y_test.values, label='Actual Values')
plt.plot(y_test_pred, label='Predicted Values (Test Set)')
plt.xlabel('Sample Index')
plt.ylabel('Zone 1 Power Consumption')
plt.title('Actual vs Predicted Values')
plt.legend()
plt.show()
R^2 score: 0.40119520100207084
Mean Squared Error (MSE): 30236280.90217556
Mean Absolute Error (MAE): 3977.115679125906
Model Accuracy = (R^2)*100 = 40.120%
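Collecting the R² scores printed above (rounded) makes the comparison behind the abstract explicit; a short summary snippet:
r2_scores = {
    "Random Forest": 0.9150,
    "Decision Tree": 0.7503,
    "Gradient Boosting": 0.4178,
    "KNN": 0.4012,
    "Linear Regression": 0.2278,
    "SVR": -0.0004,
}
for name, r2 in sorted(r2_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18} R^2 = {r2:7.4f}   accuracy = {r2 * 100:7.2f}%")
# Random Forest clearly dominates, matching the conclusion in the abstract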