Long-Term Energy Demand Analysis Using Machine Learning Algorithms: A Case Study in Bangladesh
Abstract—In every energy sector, from production to consumption, machine learning and statistical models have permeated the power system over the past few decades. Among these applications, energy demand analysis is a necessary line of research in Bangladesh. This study introduces energy consumption forecasting along with the relevant environmental feature variables. Energy demand analyses vary across short-term, medium-term, and long-term horizons; this paper focuses on long-term energy demand while predicting energy consumption. The dataset in the study includes three years of monthly data, consisting of more than one million samples and diverse consumer types with versatile features, from the capital of Bangladesh, Dhaka (mainly Uttara). Different machine learning models, namely Random Forest, k-Nearest Neighbors regression, Extreme Gradient Boosting, and Light Gradient Boosting, have been used, and various performance metrics have been utilized to evaluate them. Among the models, KNN has performed considerably well, with about 0.9447 as R², 163.9 as RMSE, and 28.7 as MAE. Such a study will contribute to upgrading the power system management of Bangladesh.

Keywords—demand-side management, energy consumption, energy demand, energy management policy, long-term prediction, machine learning models

I. INTRODUCTION

Energy consumption and production worldwide play a significant role in numerous sustainability solutions, such as addressing climate change and promoting resource preservation. Historically, industrialized nations have been the primary consumers of energy [1]. Nevertheless, this scenario is presently evolving. Developing countries, spurred by industrialization, enhanced living standards, and population expansion, are experiencing a swift rise in energy usage. Consequently, worldwide energy consumption is currently estimated at about 580 million terajoules, which underscores the necessity of understanding what factors lead to this massive energy consumption [2]. Energy consumption patterns have changed since the beginning of the Industrial Revolution. The rise of different non-renewable and renewable energies, such as solar, hydro, sea-wave harvesting, and wind, has led to complex energy consumption patterns, and diverse consumption patterns are associated with different energy resources [3]. Factors like per capita income can affect annual energy consumption. Of this energy consumption, in 2021, about 28 petawatt-hours of electrical energy were generated globally [4]. Subsequently, total energy consumption is divided into different fuel types based on consumption rates, as demonstrated in Fig. 1 [5].

When electricity production exceeds electricity demand, several problems arise. Firstly, the overproduction of electricity can lead to economic waste, since the costs of generating power are incurred without corresponding customer revenue. Secondly, operating power plants, especially those that need to ramp up and down frequently to balance supply and demand, can cause additional wear and tear on equipment, which can lead to higher maintenance costs and potentially shorter lifespans for power generation equipment. Besides, power plants running on fossil fuels might continue to produce power even when it is not needed, leading to wasted resources. Thus comes the necessity of demand-side energy management. Developing an effective power management plan requires a thorough study of energy consumption, and machine learning (ML) is crucial in demand-side management (DSM): it helps to optimize electricity use and balance it with supply. Meanwhile, DSM is designed to encourage people to change how much and when they use electricity, especially during periods of high demand.

Conversely, as consumer preferences tend to evolve over time, uncertainties arise in daily energy consumption patterns [6]. In the last decade, worldwide energy demand has increased significantly. The rapid increase in global population, coupled with industrialization, economic growth, rising comfort demands, and social progress, has greatly affected global energy consumption and environmental issues [7].
Authorized licensed use limited to: North South University. Downloaded on March 15,2025 at 06:06:57 UTC from IEEE Xplore. Restrictions apply.
comparison among them is presented in Section IV, which also establishes the final discussion and results regarding the practical data. Finally, the paper closes with the feature evaluation, observations, and the conclusion in Section V.

II. MODELING OF MACHINE LEARNING ALGORITHMS

Prediction with machine learning models has brought a degree of certainty to applications in every corner of the world. Beyond the conventional approaches, many novel algorithms are available today for estimating probable future outcomes. The machine learning models used in this paper are described below.

A. K-NEAREST NEIGHBORS (KNN) REGRESSION

K-nearest neighbors as a regression model uses a distance measure to find the examples closest to a query point. Among the distance measures, Minkowski distance, Euclidean distance, Manhattan distance, Chebyshev distance, and cosine similarity are the popular ones. Using one of these measures, the model identifies the $k$ nearest neighbors of a query point and predicts the target from them. In this paper, the hyperparameter setup used: n_jobs=-1, n_neighbors=5 (others default). A few of the distance measures are:

Manhattan ($p = 1$):
$d(x_i, x_k) \triangleq \sum_{j=1}^{D} \left| x_i^{(j)} - x_k^{(j)} \right|$  (1)

Euclidean ($p = 2$):
$d(x_i, x_k) \triangleq \left( \sum_{j=1}^{D} \left( x_i^{(j)} - x_k^{(j)} \right)^2 \right)^{1/2}$  (2)

Cosine:
$s(x_i, x_k) \triangleq \cos\left(\angle(x_i, x_k)\right) = \dfrac{\sum_{j=1}^{D} x_i^{(j)} x_k^{(j)}}{\sqrt{\sum_{j=1}^{D} \left(x_i^{(j)}\right)^2} \, \sqrt{\sum_{j=1}^{D} \left(x_k^{(j)}\right)^2}}$  (3)

B. RANDOM FOREST (RF)

Being an ensemble ML algorithm, random forest (RF) randomly samples the entire dataset into multiple decision trees built with different optimization methods. The underlying decision-tree learner has the nonparametric model:

$f_{ID3}(x) \triangleq \Pr(y = 1 \mid x)$  (4)

$S \triangleq \{(x_i, y_i)\}_{i=1}^{N}$  (5)

$f_{ID3}^{S} \triangleq \frac{1}{|S|} \sum_{(x, y) \in S} y$  (6)

where $S$ is the set of labeled examples and (6) denotes the constant model with which learning begins on the sample (5) [17]. To determine a leaf node, the set of examples is split into pieces $S_-$ and $S_+$, so that for each feature $j = 1, 2, 3, \ldots, d$ and each threshold $t$, the entropy $H(S_-, S_+)$ is:

$H(S_-, S_+) \triangleq \frac{|S_-|}{|S|} H(S_-) + \frac{|S_+|}{|S|} H(S_+)$  (7)

As the algorithm trains multiple trees, it uses either the bagging or the boosting method to sample; in this paper, the model has used bagging (sampling with replacement, $S_b$) [17]. This sampling continues until $|S_b| = N$. If the model creates $B$ random samples, the number of decision trees is also $B$. Thus, for any future estimation on a new example $x$:

$y \leftarrow \hat{f}(x) \triangleq \frac{1}{B} \sum_{b=1}^{B} f_b(x)$  (8)

However, overfitting is a challenge for such an ensemble model. This RF model has the hyperparameter setup: n_estimators=20, criterion="poisson", random_state=0, n_jobs=-1 (others default).

C. EXTREME GRADIENT BOOSTING (XGBOOST) METHOD

Using the level-wise tree growth method, the ensemble extreme gradient boosting model applies both L1 (lasso) and L2 (ridge) regularization. The model has the following objective:

$L^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)$  (9)

where $y_i$ = actual observation, $\hat{y}_i^{(t-1)}$ = estimated observation, $l$ = loss of the CART (classification and regression tree) learner (summing the contributions of the trees up to iteration $t$), $t$ = iteration, $L$ = loss function, and $\Omega(f_t)$ = regularization term. Another way to express a gradient boosting model:

$f = f_0(x) \triangleq \frac{1}{N} \sum_{i=1}^{N} y_i$  (10)

$\hat{y}_i \leftarrow y_i - f(x_i)$  (11)

$f \triangleq f_0 + \alpha f_1$  (12)

where $\hat{y}_i$ is the residual, $\alpha$ is the hyperparameter known as the learning rate of the boosting model, and $f_1$ is the new decision tree [17]. In this paper, the hyperparameter setup used: n_estimators=30, max_depth=5, eta=0.1, subsample=1, colsample_bytree=1 (others default).

D. LIGHT GRADIENT BOOSTING (LIGHT-GBM) METHOD

Stochastic gradient boosting is used for estimation in the light gradient boosting ensemble algorithm. The model achieves greater training speed and improved outcomes, owing to its ability to select features automatically through gradient-based sampling. Its loss is:

$L = \frac{1}{n} \sum_{i=1}^{n} (y_i - \gamma_i)^2$  (13)

where $L$ = loss function, $\gamma_i$ = predicted value of the $i$-th sample, $y_i$ = actual observation of the $i$-th sample, and $n$ = number of samples. Depending on $L$, the algorithm minimizes the error at each step and computes the associated residuals by the following formulas [17]:

$\frac{dL}{d\gamma} = -(y_i - \gamma_i) = -(\mathrm{observed} - \mathrm{predicted})$  (14)

$r_{im} = -\left[ \frac{\partial L\left(y_i, F(x_i)\right)}{\partial F(x_i)} \right]_{F(x) = F_{m-1}(x)} \quad \text{for } i = 1, 2, \ldots, n$  (15)

$\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{n} L\left(y_i, F_{m-1}(x_i) + \gamma h_m(x_i)\right)$  (16)

$F_m(x) = F_{m-1}(x) + v_m h_m(x)$  (17)

where $r_{im}$ = pseudo-residual, $F(x_i)$ = previous model, $m$ = number of decision trees, $h_m(x)$ = decision tree fitted on the residuals, $F_m(x)$ = new prediction, $F_{m-1}(x)$ = previous prediction, and $v_m$ = learning rate in $[0, 1]$. Furthermore, Light-GBM refines the regression by updating each decision with respect to the previous loss function and the residuals it has encountered. The hyperparameter setup of the model used in the paper: num_leaves=5, n_estimators=20, subsample_for_bin=10000, random_state=42, n_jobs=-1 (others default).
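To make the distance measures in (1)–(3) concrete, here is a minimal pure-Python sketch of the three metrics and of the averaging step behind KNN regression. This is an illustration only, not the paper's actual pipeline, which relies on a library implementation with n_neighbors=5:

```python
import math

def manhattan(x, y):
    # Eq. (1): sum of absolute coordinate differences (p = 1).
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    # Eq. (2): square root of the summed squared differences (p = 2).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_similarity(x, y):
    # Eq. (3): dot product divided by the product of the vector norms.
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def knn_predict(train, query, k=5, dist=euclidean):
    # KNN regression: average the targets of the k nearest neighbours.
    neighbours = sorted(train, key=lambda xy: dist(xy[0], query))[:k]
    return sum(y for _, y in neighbours) / k
```

For example, with training pairs ([0], 0), ([1], 1), ([10], 10) and k=2, a query at [0.5] averages the two nearest targets and yields 0.5.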
Authorized licensed use limited to: North South University. Downloaded on March 15,2025 at 06:06:57 UTC from IEEE Xplore. Restrictions apply.
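The boosting recursion of (10)–(12) and (14)–(17) can be sketched with a toy implementation that uses depth-one "stumps" as the weak learners. This illustrates the generic gradient-boosting idea under squared-error loss, where the pseudo-residuals of (14) reduce to observed minus predicted; it is not the tuned XGBoost or Light-GBM setup used in the paper:

```python
def fit_stump(xs, residuals):
    # Weak learner h_m: a depth-1 stump that splits at the median x and
    # predicts the mean residual on each side of the split.
    split = sorted(xs)[len(xs) // 2]
    left = [r for x, r in zip(xs, residuals) if x < split]
    right = [r for x, r in zip(xs, residuals) if x >= split]
    lv = sum(left) / len(left) if left else 0.0
    rv = sum(right) / len(right) if right else 0.0
    return lambda x: lv if x < split else rv

def gradient_boost(xs, ys, rounds=10, lr=0.1):
    # Eq. (10): the initial model f0 is the mean of the targets.
    f0 = sum(ys) / len(ys)
    trees = []
    preds = [f0] * len(ys)
    for _ in range(rounds):
        # Eq. (11)/(14): residuals are observed minus predicted values.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # Eq. (12)/(17): F_m(x) = F_{m-1}(x) + lr * h_m(x).
        preds = [p + lr * tree(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + sum(lr * t(x) for t in trees)
```

With enough rounds the additive model converges to the per-group means of the training targets, which is exactly the fixed point of the residual updates above.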
III. DATA PROPERTIES AND ESTIMATION

For the modeling purpose, the software utilized is a Jupyter notebook running on a personal computer with a 14th-gen Intel Core i9 processor, 1 TB of SSD storage, an 8 GB NVIDIA GeForce RTX 4070 graphics card, and 32 GB of RAM. The data has been gathered from the Dhaka Electric Supply Company Limited (DESCO). The dataset mainly consists of ten features, including the target variable, which is the energy consumption in kWh. The other features are SND (location name in Dhaka), ACCOUNT_NO (electricity meter card number), MONTHLY_SPENT_MONEY (BDT), TARIFF (35 types of consumers), YEAR [2021-2023], MONTH [1-12], RELATIVE_HUMIDITY (%), AVERAGE_TEMPARATURE (℃), and MAXIMUM_TEMPARATURE (℃). After merging all the local data together, the total number of samples in the dataset is around 1.8 million.

After the comprehensive data processing, the entire dataset is scaled as required by the ML models. In this modeling, Random Forest (RF), k-Nearest Neighbors (KNN) regression, Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting (Light-GBM) have been used. Afterward, in the learning process, the data has been divided in an 80:20 ratio into training and testing sets, as shown in Fig. 2. The training set is used to train and optimize the models with certain parameters according to the output requirement. Using both the trained model and the testing set, the following performance metrics have been computed: coefficient of determination ($R^2$), mean absolute error ($MAE$), mean square error ($MSE$), and root mean square error ($RMSE$). The mathematical representations of these metrics are:

$R^2 = \left[ \dfrac{\sum_{i=1}^{N} \left( y_{pred,i} - \bar{y}_{data} \right)^2}{\sum_{i=1}^{N} \left( y_{data,i} - \bar{y}_{data} \right)^2} \times 100 \right] \%$  (18)

$RMSE = \sqrt{ \dfrac{\sum_{i=1}^{N} \left( y_{pred,i} - y_{data,i} \right)^2}{N} }$  (19)

$MAE = \dfrac{1}{N} \sum_{i=1}^{N} \left| y_{data,i} - y_{pred,i} \right|$  (20)

$MSE = \dfrac{1}{N} \sum_{i=1}^{N} \left( y_{pred,i} - y_{data,i} \right)^2$  (21)
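The four metrics in (18)–(21) can be computed directly. The following sketch follows the paper's formulas literally, including the explained-variance form of (18), which divides the variation of the predictions around the data mean by the total variation of the data:

```python
import math

def r2_percent(y_data, y_pred):
    # Eq. (18): explained variation over total variation, in percent.
    y_bar = sum(y_data) / len(y_data)
    ss_exp = sum((p - y_bar) ** 2 for p in y_pred)
    ss_tot = sum((y - y_bar) ** 2 for y in y_data)
    return ss_exp / ss_tot * 100

def rmse(y_data, y_pred):
    # Eq. (19): root of the mean squared prediction error.
    n = len(y_data)
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(y_pred, y_data)) / n)

def mae(y_data, y_pred):
    # Eq. (20): mean absolute prediction error.
    n = len(y_data)
    return sum(abs(y - p) for y, p in zip(y_data, y_pred)) / n

def mse(y_data, y_pred):
    # Eq. (21): mean squared prediction error.
    n = len(y_data)
    return sum((p - y) ** 2 for p, y in zip(y_pred, y_data)) / n
```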
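The 80:20 division of the data into training and testing sets described in Section III can be reproduced with a small helper. This is a sketch; a real pipeline would typically rely on a library utility such as scikit-learn's train_test_split:

```python
import random

def train_test_split(samples, test_ratio=0.2, seed=42):
    # Shuffle a copy so the original order is untouched, then slice
    # off the last test_ratio fraction as the testing set.
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]
```

Applied to 100 samples with test_ratio=0.2, this yields 80 training and 20 testing samples, with every sample appearing in exactly one of the two sets.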
mostly prominent in the modeling era. Likewise, XGBoost and Light-GBM are not lagging either: XGBoost has reached around 87% accuracy in the prediction part, with an error stated as 37. Again, with a boosting approach similar to XGBoost, Light-GBM has achieved 81% accuracy. On the contrary, as RF suffers from overfitting and given the complexity of the pattern, it has performed poorly, with only around 25% accuracy and 8.5 $MAE$.

Fig. 3. Accuracy graphs of the ML models: (a) RF, (b) KNN, (c) XGBoost, and (d) Light-GBM.

V. CONCLUSION

This paper has presented a proposed model for forecasting energy demand on a long-term basis, with a comparison among different machine learning algorithms: Random Forest (RF), k-Nearest Neighbors (KNN) regression, Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting (Light-GBM). As KNN has a strong ability to recognize sequences from earlier patterns while handling a sizeable amount of data, it has achieved the highest accuracy rate of 94%, with an error rate of approximately 28.7. Likewise, Light-GBM and XGBoost have their own advantages. With excellent scaling properties, these models have shown 81% and 87% accuracy, respectively, and from the computational perspective, XGBoost has the quicker computation time. Contrarily, RF has displayed 25% accuracy with meager error rates. Such trade-offs can be exploited in any application platform for the power sector based on this modeling. Especially, a developing country like Bangladesh, which is currently facing power issues, can consider such a study for practical use, so that a stable power management policy can be established. Hence, the future aim of this work is to introduce hybrid models, compare them with the existing ones, and incorporate big data.
REFERENCES

[1] B. van Ruijven et al., "Modeling Energy and Development: An Evaluation of Models and Concepts," World Dev., vol. 36, no. 12, pp. 2801–2821, Dec. 2008, doi: 10.1016/j.worlddev.2008.01.011.
[2] "Global energy consumption," The World Counts. https://www.theworldcounts.com/challenges/climate change/energy/global-energy-consumption
[3] H. Ritchie, P. Rosado, and M. Roser, "Energy Production and Consumption," OurWorldInData.org. https://ourworldindata.org/energy-production-consumption (accessed 25 June 2024).
[4] "List of countries by electricity production," Wikipedia. https://en.wikipedia.org/wiki/World_energy_supply_and_consumption (accessed 25 June 2024).
[5] "World energy supply and consumption," Wikipedia. https://en.wikipedia.org/wiki/World_energy_supply_and_consumption (accessed 25 June 2024).
[6] N. Al Khafaf, M. Jalili, and P. Sokolowski, "Application of Deep Learning Long Short-Term Memory in Energy Demand Forecasting," 2019, pp. 31–42, doi: 10.1007/978-3-030-20257-6_3.
[7] N. Somu, G. R. M R, and K. Ramamritham, "A hybrid model for building energy consumption forecasting using long short term memory networks," Appl. Energy, vol. 261, p. 114131, Mar. 2020, doi: 10.1016/j.apenergy.2019.114131.
[8] Mystakidis, P. Koukaras, N. Tsalikidis, D. Ioannidis, and C. Tjortjis, "Energy Forecasting: A Comprehensive Review of Techniques and Technologies," Energies (Basel), vol. 17, no. 7, p. 1662, Mar. 2024, doi: 10.3390/en17071662.
[9] D. Solyali, "A Comparative Analysis of Machine Learning Approaches for Short-/Long-Term Electricity Load Forecasting in Cyprus," Sustainability, vol. 12, no. 9, p. 3612, Apr. 2020, doi: 10.3390/su12093612.
[10] S. Taheri, M. Jooshaki, and M. Moeini-Aghtaie, "Long-term planning of integrated local energy systems using deep learning algorithms," Int. J. Electr. Power Energy Syst., vol. 129, p. 106855, Jul. 2021, doi: 10.1016/j.ijepes.2021.106855.
[11] N. G. Paterakis, E. Mocanu, M. Gibescu, B. Stappers, and W. van Alst, "Deep learning versus traditional machine learning methods for aggregated energy demand prediction," in 2017 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe), Sep. 2017, pp. 1–6, doi: 10.1109/ISGTEurope.2017.8260289.
[12] T. Vantuch, A. G. Vidal, A. P. Ramallo-Gonzalez, A. F. Skarmeta, and S. Misak, "Machine learning based electric load forecasting for short and long-term period," in 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), Feb. 2018, pp. 511–516, doi: 10.1109/WF-IoT.2018.8355123.
[13] T. Ahmad and H. Chen, "Potential of three variant machine-learning models for forecasting district level medium-term and long-term energy demand in smart grid environment," Energy, vol. 160, pp. 1008–1020, Oct. 2018, doi: 10.1016/j.energy.2018.07.084.
[14] N. Somu, G. R. M R, and K. Ramamritham, "A hybrid model for building energy consumption forecasting using long short term memory networks," Appl. Energy, vol. 261, p. 114131, Mar. 2020, doi: 10.1016/j.apenergy.2019.114131.
[15] N. Somu, G. Raman M R, and K. Ramamritham, "A deep learning framework for building energy consumption forecast," Renew. Sustain. Energy Rev., vol. 137, p. 110591, Mar. 2021, doi: 10.1016/j.rser.2020.110591.
[16] T. M. Ghazal, "Energy demand forecasting using fused machine learning approaches," Intelligent Automation & Soft Computing, vol. 31, no. 1, 2022.
[17] A. Burkov, The Hundred-Page Machine Learning Book, 1st ed. Seattle, WA, USA, 2019.