Abstract—Integrating advanced artificial intelligence (AI), the Internet of Things (IoT), and cutting-edge cloud computing epitomises the transformative potential of Industry 5.0 technologies, enabling unprecedented automation and efficiency. However, this technological surge also brings serious environmental challenges, significantly increasing energy consumption and carbon emissions. This paper introduces EcoCloud, a robust task scheduling mechanism based on Ant Colony Optimisation (ACO) principles that aims to improve energy and carbon efficiency in Industrial IoT-enabled cloud environments. EcoCloud dynamically schedules MapReduce jobs on Hadoop clusters in IoT-based cloud systems by leveraging real-time resource consumption metrics through a comprehensive energy model deployed via a Multi-Layer Perceptron (MLP) neural network. As a result, the model accurately predicts power consumption and distributes workloads to underutilised nodes to optimise energy usage and reduce carbon emissions. Extensive evaluations show that EcoCloud significantly outperforms traditional scheduling methods, improving energy consumption and overall system performance.

Index Terms—Carbon intelligent computing, Big data, Cloud computing, Energy efficiency, Task scheduling.

I. INTRODUCTION

Addressing these issues is a critical area of research and innovation. Activities are increasingly aimed at developing energy-efficient solutions to reduce the ecological footprint of growing data centre operations. Under the Sustainable Development Agenda, the United Nations has highlighted the potential of intelligent IoT solutions to reduce energy consumption considerably. For instance, using AI to predict energy demands and align task execution with the availability of low-carbon energy sources at a specific time optimises resource usage and minimises carbon footprints [6]. Pioneering companies such as Google, with its carbon-neutral data centres, are advancing carbon-intelligent computing, shifting computational loads in time and location to where energy from low-carbon sources is available [7]. These strategies can be integrated into the IIoT to maintain a delicate equilibrium between high performance and environmental sustainability, creating greener and more efficient industrial ecosystems. Fig. 1 illustrates the energy-efficient processing framework for IoT data within cloud environments. Data flows from IoT devices to the cloud-based big data cluster
Many works adopt strategies to enhance energy efficiency and task scheduling in cloud computing environments. Tziritas et al. [8] propose an application-aware workload consolidation framework to minimise energy consumption and network load in cloud environments. The technique leverages a multi-objective optimisation model to reduce both energy consumption and network load by strategically placing workloads in a manner that optimises resource use. However, the framework primarily focuses on static workloads and does not incorporate dynamic adaptability to changing resource utilisation in real-time, which limits its effectiveness for highly variable and dynamic workloads, where resource demands can change rapidly and unpredictably. The authors of [9] investigate applying deep learning techniques to develop an energy-efficient task scheduling algorithm that predicts resource usage and optimises task allocation within cloud computing environments. The DNN model is trained on historical data to understand and forecast resource demands, facilitating more energy-efficient task scheduling. While this approach shows potential for significant energy savings, it is limited by the computational complexity of training and deploying deep learning models, which require substantial computational resources and time; this can be prohibitive in large-scale cloud environments where rapid and efficient decision-making is crucial. Hussain et al. [10] introduce the Energy and Performance-Efficient Task Scheduling Algorithm (EPETS), specifically designed for heterogeneous virtualised cloud environments. EPETS operates in two stages: an initial scheduling phase to reduce execution time and a subsequent reassignment phase that optimises energy consumption while ensuring tasks meet their deadlines. This two-stage approach effectively balances performance and energy efficiency. However, the algorithm primarily addresses static task environments and lacks a dynamic scheduling mechanism that can adapt to real-time changes in resource utilisation, which makes EPETS unsuitable for cloud environments where workload demands are highly variable and require constant adjustments to maintain optimal performance and energy efficiency. The authors in [11] present a set of energy-aware resource allocation heuristics aimed at improving the energy efficiency of data centres in cloud computing environments, focusing on optimising the allocation of resources such as CPU, memory, and storage to reduce overall energy consumption. They aim to minimise energy waste and improve data centre efficiency by strategically managing these resources. Despite their effectiveness, these heuristics face challenges in dealing with the dynamic and fluctuating demands of modern cloud applications. As workloads and resource needs change rapidly, the static nature of these heuristics means they struggle to adapt, leading to inefficiencies and higher energy consumption in cloud environments.

ACO, developed by [12], is a metaheuristic algorithm inspired by the foraging behaviour of ants, which find the shortest paths between their colony and food sources by depositing and following pheromone trails. Many works adopt ACO for MapReduce task scheduling due to its effectiveness in optimising resource allocation and improving system performance in distributed computing environments. Liu et al. [13] propose a hybrid task scheduling algorithm that combines the strengths of Genetic Algorithms (GA) and ACO to enhance task scheduling in cloud computing environments. The algorithm leverages GA for its global search capability to quickly find an optimal solution, which is then converted into the initial pheromone for ACO to refine and achieve high accuracy in task scheduling. Although this hybrid approach demonstrates improved performance over standalone GA and ACO, it still faces scalability and real-time adaptability limitations, as the algorithm does not incorporate real-time monitoring or predictive analytics. The authors of [14] present an improved ACO-based job scheduling mechanism for Hadoop clusters, introducing an aggregator node to enhance the job scheduling capabilities and optimise resource allocation. This approach modifies the default Hadoop Distributed File System (HDFS) architecture by adding aggregator nodes that assign jobs to data nodes, while an improvised ACO method schedules jobs based on job size and expected execution time. Despite its effectiveness in improving Hadoop's performance, the method primarily focuses on static job characteristics and lacks dynamic adaptability to real-time changes in resource utilisation. Similarly, Jeyaraj and Paul [15] explore an ACO-based optimisation strategy for scheduling MapReduce tasks in virtualised heterogeneous environments. The algorithm aims to improve task scheduling efficiency by considering the heterogeneity of virtualised resources and optimising resource allocation. Although the approach effectively addresses resource heterogeneity and enhances scheduling efficiency, it does not incorporate a real-time monitoring and prediction model to adapt to dynamic changes in resource demands, which limits its ability to optimise energy usage and reduce carbon footprints in dynamic cloud environments. Deng et al. [16] propose an improved algorithm for energy-aware task scheduling on heterogeneous multiprocessor systems. The algorithm enhances task assignment and scheduling sequence to minimise energy costs, with results demonstrating superior efficiency over existing methods. However, the algorithm relies on static scheduling parameters and lacks real-time adaptability, making it less effective in environments with highly variable workloads and rapidly changing resource demands.

Many techniques focus on optimising energy consumption without fully integrating carbon-intelligent strategies, reducing their potential for maximising environmental benefits. Additionally, static or semi-static resource allocation methods do not adapt efficiently to real-time resource utilisation and energy supply fluctuations, leading to suboptimal energy efficiency and increased carbon footprints. While deep learning techniques show promise, their application in task scheduling within cloud environments is underexplored, particularly in integrating real-time monitoring data. Similarly, ACO, widely used for MapReduce task scheduling, usually lacks dynamic adaptability and real-time monitoring integration, limiting its efficiency in handling fluctuating workloads and optimising energy use. Therefore, a comprehensive, dynamic, and intelligent task scheduling solution integrating real-time monitoring, deep learning predictions, and adaptive optimisation techniques is needed to enhance energy efficiency and reduce carbon footprints in cloud computing environments.
In light of these limitations and challenges, we investigate the following research questions:
• How can dynamic task scheduling be optimised to adapt in real-time to fluctuating resource demands in cloud environments?
• How can ACO be effectively integrated with deep learning techniques to enhance energy efficiency in cloud computing?
Considering the limitations of existing works and the current challenges in the field, our contributions are as follows:
• Integration of Deep Learning for Energy Prediction: We develop an MLP to accurately predict energy consumption based on real-time data, facilitating informed decision-making for task scheduling.
• Development of a Dynamic Task Scheduling Algorithm: We propose an ACO-based task scheduling algorithm that adapts dynamically to real-time resource utilisation metrics to ensure optimal energy efficiency and reduce the carbon footprint in cloud environments.
The proposed system is detailed in §II, with the evaluation and discussion of experimental results presented in §III. Finally, the conclusion is provided in §IV.

II. PROPOSED METHOD

The proposed system comprises four main components: Real-time Monitoring, Energy Model, Energy Consumption Prediction using a Multilayer Perceptron (MLP), and ACO-based Task Scheduling. The Real-time Monitoring component collects logs from cloud-based big data systems to monitor resource utilisation and the state of big data tasks. The Energy Model focuses on understanding and optimising the power consumption of computing nodes by monitoring and managing their resource utilisation, particularly the CPU, which is the primary energy consumer. The Energy Consumption Prediction component employs an MLP neural network to accurately predict the energy consumption of nodes based on their resource usage data, facilitating informed decision-making for energy-efficient task scheduling. Finally, the ACO-based Task Scheduling component leverages the ACO algorithm to assign MapReduce tasks to nodes so as to minimise energy consumption and balance the load across the cluster, guided by pheromone levels and heuristic information. Fig. 2 depicts the high-level implementation of EcoCloud in a large-scale cloud environment for processing IoT data.

A. Real-time Monitoring

We employ SmartMonit [17], a real-time big data monitoring tool, to gather data from the big data cluster in cloud environments. SmartMonit is designed with an adaptive and dynamic pipeline that facilitates seamless data transmission from the big data cluster to the time-series database InfluxDB, continuously monitoring resource utilisation and the state of each big data task to provide a comprehensive view of system performance. By leveraging SmartMonit, we obtain real-time metrics for making informed decisions about task scheduling and energy management. SmartMonit collects data in parallel across distributed nodes using lightweight agents deployed on each node, ensuring minimal latency and high data accuracy. These agents operate independently to gather metrics simultaneously, which are then aggregated in real-time, allowing the system to efficiently capture performance variations across nodes. Integrating SmartMonit within our system ensures that we maintain an up-to-date understanding of resource consumption patterns, which is essential for optimising energy usage, reducing the carbon footprint, and enhancing the overall efficiency of the proposed EcoCloud framework.
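Since the monitoring pipeline writes metrics to InfluxDB, a scheduler-side consumer can be sketched as follows. This is a minimal illustration assuming an InfluxDB 2.x instance; the bucket, measurement, and field names are hypothetical placeholders and do not describe SmartMonit's actual schema.

```python
from influxdb_client import InfluxDBClient  # InfluxDB 2.x Python client

# Connection settings are illustrative placeholders.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

# Flux query: mean CPU and memory utilisation per node over the last minute.
# The bucket/measurement/field names below are assumptions for this sketch.
flux = '''
from(bucket: "cluster_metrics")
  |> range(start: -1m)
  |> filter(fn: (r) => r._measurement == "node_resources")
  |> filter(fn: (r) => r._field == "cpu_util" or r._field == "mem_util")
  |> group(columns: ["host", "_field"])
  |> mean()
'''

latest = {}
for table in client.query_api().query(flux):
    for record in table.records:
        host = record.values.get("host")
        latest.setdefault(host, {})[record.get_field()] = record.get_value()

# e.g. {'worker-1': {'cpu_util': 0.42, 'mem_util': 0.63}, ...}
print(latest)
```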
B. Energy Model

In cloud data centres, the resource utilisation of computing nodes, such as CPU, memory, disk, and network, greatly influences their power usage. The CPU uses the most energy compared to other system resources, and CPU utilisation usually reflects the total load on the machine [11]. Hence, we concentrate on controlling and optimising its power consumption in this work.

1) Resource Utilisation Computation: The optimal use of resources in the cloud significantly impacts the system's overall performance and profitability. High resource utilisation ensures that the available computational resources are effectively used, leading to increased profit and reduced energy consumption by minimising idle resources. Therefore, a resource management technique should be evaluated based on resource utilisation, computed using the following equation [18]:

U_y(G, t) = \sum_{j=1}^{V} g_{yj} \times \frac{RCPU_j(t)}{CPU_y} \qquad (1)

Eq. 1 presents the utilisation U_y(G, t) of a server S_y at a specific time t. G represents the placement of virtual machines (VMs), whereas g_{yj} indicates whether a VM V_j is hosted on the server S_y: if V_j is hosted on S_y, the value of g_{yj} is one; otherwise, it is zero. CPU_y represents the total computation capacity of S_y, and RCPU_j(t) is the amount of CPU capacity required by V_j at the specified time.

2) Energy Consumption Computation: As discussed, energy efficiency is a critical metric in resource management, influenced by resource utilisation. High resource utilisation leads to better energy efficiency by ensuring that servers are actively processing tasks rather than idling. Energy-efficient resource management techniques aim to minimise energy consumption while maintaining high performance. To compute the power consumption of a specific server at a given time t with a placement G, we use the following equation [11]:

P_y(G, t) = 0.7 P_y^{max} + 0.3 P_y^{max} \times U_y(G, t) \qquad (2)

where P_y^{max} represents the maximum power consumed by the server when it is fully utilised, and U_y(G, t) is the utilisation of the server at time t. To calculate the total energy consumption between times t_1 and t_2, the following equation is used:

Energy(G, t_1, t_2) = \sum_{y=1}^{S} \int_{t_1}^{t_2} P_y(G, t) \, dt \qquad (3)

where S is the total number of servers, G is the placement of the servers, and P_y(G, t) is the power consumption of the y-th server at time t.
Fig. 2: Implementation of EcoCloud in cloud-based big data ecosystem for IoT data processing.
C. Energy Consumption Prediction using MLP

An MLP [19], a type of Artificial Neural Network (ANN), is structured with multiple layers: an input layer, one or more hidden layers, and an output layer, where each layer consists of nodes (neurons) fully connected to the neurons in the next layer. Fig. 3 illustrates the structure of the MLP used for predicting energy consumption based on resource utilisation metrics. The red circles represent the bias terms (b_k^j) at each layer, which adjust the output of the weighted sum of inputs. Additionally, w_{ij} and y_k indicate the weights and outputs in each layer, respectively. These additions distinguish our model from standard deep learning architectures by providing a more detailed view of the internal components and highlighting the role of biases and weight connections in the MLP structure.

Fig. 3: Structure of the MLP used for predicting energy consumption, with an input layer, hidden layers, and an output layer.

The induced local field of neuron j in layer k, denoted as v_k^j, is calculated using the following equation:

v_k^j = \sum_{i=1}^{n_{k-1}} u_k^{ij} \cdot z_{k-1}^{i} + b_k^j, \qquad (4)

where n_{k-1} indicates the number of neurons in the (k-1)-th layer, u_k^{ij} specifies the weight connecting neuron i in the (k-1)-th layer to neuron j in the k-th layer, z_{k-1}^{i} represents the activation of neuron i in the (k-1)-th layer, and b_k^j is the bias for neuron j in the k-th layer. The activation z_k^j of neuron j in layer k is obtained by applying an activation function ψ to the local field v_k^j:

z_k^j = ψ(v_k^j). \qquad (5)

For the input layer (k = 0), the activation values are simply the input features.

1) Training Process: Training the MLP involves two primary steps. Forward Propagation computes the activations for each layer, starting from the input layer and moving forward to the output layer. Backward Propagation adjusts the weights and biases using the backpropagation algorithm to minimise the error between the predicted and actual output values: the error is propagated back through the network, and gradients for the weights and biases are computed using the chain rule. This forward and backward propagation cycle continues until all epochs are completed, resulting in a trained set of weights that can accurately predict energy consumption from the input data. After forward propagation, the error signal for each output neuron j in the output layer L is calculated as:

e_j = t_j − y_j, \qquad (9)

where t_j is the target value representing the actual energy consumption. The local gradient δ_j^L for neuron j in the output layer is computed as:

δ_j^L = e_j \cdot ψ'(v_j^L), \qquad (10)

where ψ' is the derivative of the activation function. For a hidden layer 0 < k < L, the local gradient δ_k^j is computed as:

δ_k^j = ψ'(v_k^j) \cdot \sum_{m=1}^{h} δ_{k+1}^{m} \cdot u_{k+1}^{jm}, \qquad (11)

where h is the number of neurons in the (k+1)-th layer. The weights u_k^{ij} are updated using the following rule:

u_k^{ij}(t+1) = u_k^{ij}(t) − η \cdot \frac{1}{p} \sum_{r=1}^{p} δ_k^j(r) \cdot z_{k-1}^{i}(r), \qquad (12)

where t is the iteration number, p is the batch size, and η is the learning rate. Algorithm 1 summarises this training procedure.

Algorithm 1 Energy consumption prediction with MLP
Require: (X, T): training data; η: learning rate; p: batch size; E: number of epochs
Ensure: Trained weights u_k^{ij}
1: Initialise weights u_k^{ij} randomly
2: for epoch = 1 to E do
3:   for each batch (X_batch, T_batch) in (X, T) do
4:     Forward Propagation:
5:     for each training example x in X_batch do
6:       z_0 ← x
7:       for layer k = 1 to L do
8:         for neuron j = 1 to n_k do
9:           v_k^j ← weighted sum of inputs and bias
10:          z_k^j ← activation of v_k^j
11:        end for
12:      end for
13:    end for
14:    Backward Propagation:
15:    for each training example x in X_batch do
16:      Compute the error e_j ← t_j − z_j^L
17:      for neuron j = 1 to n_L do
18:        δ_j^L ← error gradient for the output neuron
19:      end for
20:      for layer k = L − 1 to 1 do
21:        for neuron j = 1 to n_k do
22:          δ_k^j ← weighted sum of gradients from the next layer
23:        end for
24:      end for
25:      for layer k = 1 to L do
26:        for neuron j = 1 to n_k do
27:          for neuron i = 1 to n_{k−1} do
28:            u_k^{ij}(t+1) ← adjust using the learning rate and gradient
29:          end for
30:          b_k^j(t+1) ← adjust using the gradient of the bias
31:        end for
32:      end for
33:    end for
34:  end for
35: end for
36: return u_k^{ij}
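For readers who prefer executable code to the pseudocode above, the NumPy sketch below mirrors the forward pass of Eqs. (4)–(5) and the backward pass of Eqs. (9)–(12). The layer sizes, tanh hidden activation, linear output unit, learning rate, and toy data are illustrative assumptions, not the configuration tuned for our experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 3 resource features -> 8 hidden neurons -> 1 power estimate.
sizes = [3, 8, 1]
W = [rng.normal(0.0, 0.1, (sizes[k], sizes[k + 1])) for k in range(len(sizes) - 1)]
b = [np.zeros(sizes[k + 1]) for k in range(len(sizes) - 1)]
psi = np.tanh                                    # hidden activation
dpsi = lambda v: 1.0 - np.tanh(v) ** 2           # its derivative

def forward(x):
    """Eqs. (4)-(5): induced local fields and activations, layer by layer."""
    zs, vs = [x], []
    for k in range(len(W)):
        v = zs[-1] @ W[k] + b[k]                             # Eq. (4)
        vs.append(v)
        zs.append(psi(v) if k < len(W) - 1 else v)           # linear output for regression
    return zs, vs

def train_step(X, T, eta=0.05):
    """Eqs. (9)-(12): backpropagate the batch error and update weights and biases."""
    zs, vs = forward(X)
    delta = zs[-1] - T                                       # Eqs. (9)-(10), linear output unit
    for k in reversed(range(len(W))):
        grad_W = zs[k].T @ delta / len(X)                    # Eq. (12), batch-averaged gradient
        grad_b = delta.mean(axis=0)
        if k > 0:
            delta = (delta @ W[k].T) * dpsi(vs[k - 1])       # Eq. (11)
        W[k] -= eta * grad_W
        b[k] -= eta * grad_b

# Toy usage: learn a CPU/memory-weighted power proxy from random utilisation samples.
X = rng.random((64, 3))
T = 0.7 * X[:, :1] + 0.3 * X[:, 1:2]
for _ in range(500):
    train_step(X, T)
print(float(np.mean((forward(X)[0][-1] - T) ** 2)))          # final training MSE
```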
D. ACO-based Task Scheduling

Ant Colony Optimisation (ACO) is a metaheuristic inspired by the foraging behaviour of ants and is used to find approximate solutions to combinatorial optimisation problems. Fig. 4 depicts task scheduling for the cases based on CPU utilisation. Tasks are allocated dynamically, ensuring that fully utilised nodes, such as Node 2, do not receive additional tasks, even if sufficient memory capacity is available. This behaviour reflects the scheduling algorithm's prioritisation of CPU constraints over memory availability in task allocation, illustrating the EcoCloud system's resource management strategy.

Fig. 4: Task scheduling based on CPU utilisation, showing pending tasks, the queue, the big data cluster nodes, and resource prediction with the MLP.

Algorithm 2 leverages pheromone levels and heuristic information to efficiently assign MapReduce tasks to nodes in a Hadoop cluster. During each iteration, artificial ants construct solutions by probabilistically selecting tasks and assigning them to nodes based on the current pheromone levels and heuristic values, which reflect the energy efficiency of the assignments. After creating a complete solution, each ant evaluates its total energy consumption. Pheromone evaporation is applied to all pheromone trails, reducing their intensity over time. Pheromone is then deposited in proportion to the quality of the solution, with better solutions receiving more pheromone. This iterative process continues, refining the pheromone trails and improving the task–node assignments.

1) Initialisation: Let N be the number of tasks (MapReduce jobs) to be scheduled, and M be the number of nodes available for task execution. We define the pheromone level τ_{ij} for each task i and node j, initialised to τ_0 (see Algorithm 2).
2) Path Construction: Each ant constructs a path by moving from one task to another based on a probabilistic decision rule. The probability P_{ij}^{k}(t) of ant k assigning task i to node j is given by:

P_{ij}^{k}(t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{l \in \text{allowed}} [\tau_{il}(t)]^{\alpha}\,[\eta_{il}]^{\beta}} & \text{if } j \in \text{allowed} \\ 0 & \text{otherwise} \end{cases}

where α and β are parameters that control the influence of pheromone and heuristic information, respectively, and allowed is the set of nodes the ant has not visited yet.

3) Heuristic Information: The heuristic information η_{ij} is defined as the inverse of the energy consumption E_{ij} associated with assigning task i to node j:

η_{ij} = \frac{1}{E_{ij}}

4) Pheromone Update: After all ants have constructed their paths, the pheromone levels are updated. Pheromone evaporation is applied first, where ρ is the evaporation rate:

τ_{ij}(t+1) = (1 − ρ)\,τ_{ij}(t)

Next, pheromone is deposited by the ants based on the quality of their solutions:

τ_{ij}(t+1) = τ_{ij}(t+1) + \sum_{k=1}^{m} \Delta τ_{ij}^{k}(t)

where m is the number of ants and Δτ_{ij}^{k}(t) is the amount of pheromone deposited by ant k:

\Delta τ_{ij}^{k}(t) = \begin{cases} \dfrac{Q}{E^{k}} & \text{if ant } k \text{ uses edge } (i, j) \\ 0 & \text{otherwise} \end{cases}

Here, Q is a constant and E^{k} is the total energy consumption of the solution constructed by ant k.

Algorithm 2 Ant Colony Optimisation for task scheduling
Require: N: number of tasks; M: number of nodes; α: influence of pheromone; β: influence of heuristic; ρ: pheromone evaporation rate; Q: pheromone deposit factor; τ_0: initial pheromone level
Ensure: Task-to-node assignment of the best solution
1: Initialise pheromone levels τ_{ij} ← τ_0 for all tasks i and nodes j
2: for each iteration do
3:   for each ant k do
4:     Initialise an empty solution S_k
5:     Initialise the list of unvisited tasks
6:     Select a starting task randomly
7:     while there are unvisited tasks do
8:       Select the next task i to assign to node j with probability P_{ij}^{k}
9:       Add task i to node j in solution S_k
10:      Remove task i from the list of unvisited tasks
11:    end while
12:    Evaluate S_k and compute the total energy consumption E^{k}
13:  end for
14:  Pheromone Evaporation:
15:  for each task i and node j do
16:    τ_{ij} ← (1 − ρ) τ_{ij}
17:  end for
18:  Pheromone Update:
19:  for each ant k do
20:    for each task i assigned to node j in solution S_k do
21:      τ_{ij} ← τ_{ij} + Q / E^{k}
22:    end for
23:  end for
24: end for
25: return the best assignment
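To make the loop in Algorithm 2 concrete, the following Python sketch implements the probability rule, evaporation, and deposit described above for a small task-to-node assignment problem. It is a simplified illustration under stated assumptions: each task may choose among all nodes (no visited-node restriction), the energy matrix stands in for the MLP predictions, and the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

N, M = 6, 3                        # tasks and nodes (illustrative sizes)
alpha, beta = 1.0, 2.0             # influence of pheromone and heuristic
rho, Q, tau0 = 0.1, 1.0, 0.5       # evaporation rate, deposit factor, initial pheromone
n_ants, n_iters = 8, 50

# E[i, j]: predicted energy of running task i on node j (stand-in for the MLP output).
E = rng.uniform(40.0, 70.0, size=(N, M))
eta = 1.0 / E                      # heuristic information: inverse energy
tau = np.full((N, M), tau0)        # pheromone levels

best_assign, best_energy = None, np.inf
for _ in range(n_iters):
    solutions = []
    for _ in range(n_ants):
        assign = np.empty(N, dtype=int)
        for i in rng.permutation(N):                      # visit tasks in a random order
            weights = (tau[i] ** alpha) * (eta[i] ** beta)
            probs = weights / weights.sum()               # probabilistic decision rule
            assign[i] = rng.choice(M, p=probs)
        energy = E[np.arange(N), assign].sum()            # total energy E^k of the solution
        solutions.append((assign, energy))
        if energy < best_energy:
            best_assign, best_energy = assign.copy(), energy
    tau *= (1.0 - rho)                                    # pheromone evaporation
    for assign, energy in solutions:                      # deposit Q / E^k on used edges
        tau[np.arange(N), assign] += Q / energy
print(best_assign, best_energy)
```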
longer makespans might lead to more consistent power usage
III. E VALUATION AND R ESULT A NALYSIS patterns. This consistency might stem from improved load
A. Experimental Setup balancing over extended execution times, reducing fluctuations
1) IoT Dataset: To evaluate the proposed system, we in power consumption. Fig. 6 presents the feature coefficients
employ the CIC IoT Dataset 2023 [20], developed by the for dynamic power consumption, focusing on CPU usage,
Canadian Institute for Cybersecurity, designed for profiling, memory usage, and network traffic, indicating that increases in
behavioural analysis, and vulnerability testing of IoT devices, CPU and memory usage raise power consumption, with CPU
and it includes various benign and malicious activities across usage having the most substantial impact, while the network
different IoT protocols. traffic coefficient is near zero, showing that it has a minimal
2) Experimental Testbed: To facilitate distributed process- effect on power consumption. The dominant role of CPU usage
ing, we deploy a Hadoop cluster on AWS, consisting of one highlights its critical influence on power dynamics, aligning
master node and three worker nodes that configured with 4 with the intuitive understanding that computationally intensive
CPUs and 16 GB of memory, running the Ubuntu system. tasks directly drive energy consumption. Fig. 7 shows total
energy consumption values ranging from approximately 8400
to 9800 joules as a function of the makespan. Like the power
B. Evaluation the Energy Model consumption data, energy consumption exhibits clustering,
As discussed in §II-B, we consider the energy consumption with most data points falling between 8800 and 9400 joules.
based on CPU, memory, and network utilisation. The eval- This clustering reflects typical task execution profiles, where
uation of our energy model involves analysing the dynamic resource usage is optimised for standard workloads. There is
power consumption and total energy consumption concerning a noticeable trend where energy consumption increases with
the makespan. Fig. 5 demonstrates various dynamic power makespan, reflecting the direct relationship between execution
Fig. 5: Power consumption evaluation.
Fig. 6: Feature coefficients.
Fig. 7: Energy consumption evaluation.
Fig. 11: Classification metrics (Accuracy 0.84, Precision 0.82, Recall 0.85, F1 score 0.84).
Fig. 12: Makespan comparison.
Fig. 13: Energy consumption comparison.
However, there are instances where lower makespans correspond to higher energy consumption, possibly due to suboptimal task scheduling leading to inefficient resource utilisation. These outliers emphasise the importance of effective task scheduling algorithms to minimise energy waste during shorter execution times.

C. Evaluation of the Energy Consumption Prediction

To evaluate informed decision-making for task scheduling, this section presents the results of the MLP developed to accurately predict energy consumption based on real-time data in large-scale clusters. As seen from Fig. 8, the training loss decreases steadily and stabilises around 50 epochs, reaching a minimum value below 100. The validation loss follows a similar trend, although it stabilises at a slightly higher value than the training loss, indicating that the model learns effectively and generalises well to new data without overfitting. The convergence of the loss curves reflects the robustness of the model, ensuring its applicability in dynamic, real-world environments. Fig. 9 shows the correlation between the predicted and actual makespan, indicating the prediction accuracy. The predictions are relatively accurate compared to the true values, though there is some spread, especially for makespans between 65 and 80, indicating room for improvement in prediction accuracy. This discrepancy can arise from the gap between a workload's nominal difficulty and how it actually performs when co-run with other workloads, which contend for resources at runtime and therefore make the inputs noisy predictors. Finally, Fig. 10 reports key regression metrics that indicate how well the model performs: the R² score, Mean Absolute Error (MAE), and Mean Squared Error (MSE). An R² score of 0.60 means that 60% of the dependent variable's variance can be explained or predicted from the independent variables, indicating a moderate fit, with an MAE of 3.36 and an MSE of 17.45.
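For reference, these regression metrics can be computed from prediction arrays with standard scikit-learn helpers, as in the sketch below; the arrays are dummy values for illustration, not our experimental outputs.

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Dummy stand-ins for the actual and MLP-predicted values.
y_true = np.array([62.0, 70.5, 75.2, 81.0, 68.3])
y_pred = np.array([60.1, 73.0, 71.8, 84.2, 66.0])

print("R^2:", r2_score(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
print("MSE:", mean_squared_error(y_true, y_pred))
```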