Scientific Programming
Volume 2021, Article ID 7216795, 11 pages
https://doi.org/10.1155/2021/7216795
Research Article
Makespan Optimisation in Cloudlet Scheduling with Improved
DQN Algorithm in Cloud Computing
Received 6 July 2021; Revised 28 September 2021; Accepted 1 October 2021; Published 21 October 2021
Copyright © 2021 Amine Chraibi et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Despite cloud service providers increasingly adopting advanced cloud infrastructure management, substantial execution time is lost due to minimal server usage. Given the importance of reducing the total execution time (makespan), a vital metric for cloud service providers, while sustaining Quality of Service (QoS), this study established an enhanced scheduling algorithm, MCS-DQN, that minimises the cloudlet scheduling (CS) makespan using the deep Q-network (DQN) algorithm. A novel reward function was proposed to enhance the convergence of the DQN model. Additionally, an open-source simulator (CloudSim) was employed to assess the performance of the suggested work. The results show that the recommended MCS-DQN scheduler achieved the best outcomes against the compared algorithms in minimising the makespan metric and the other examined measures (task waiting time, resource usage of the virtual machines, and the degree of imbalance).
outcomes by comparing task scheduling (TS) measures (waiting time, makespan reduction, and enhanced resource usage). The remaining sections are arranged as follows: Section 2 outlines pertinent literature, Section 3 presents the DQN algorithm, Section 4 highlights the recommended work, Section 5 explains the research experiment setup and simulation outcomes, and Section 6 offers the study conclusion.

Sun and Qi [15] proposed a hybrid task scheduler based on local search and differential evolution (DE) to enhance the makespan and cost metrics. The authors in [16] presented a parallel optimized relay selection protocol to minimise latency, collision, and energy for wake-up radio-enabled WSNs.
3. Background

...

(3) Policies of transitioning from states to actions
(4) Rules that identified the immediate reward scalar of a transition
(5) Rules that outlined agent perception

Figure 2: Representation of deep Q-network (the state is fed to the network, which outputs one Q-value per action, Action 1 through Action N, with the reward at time t fed back from the environment).
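To make the deep Q-network of Figure 2 concrete, the following minimal Python sketch (an illustration under the usual DQN formulation [28], not the authors' code; the layer sizes and discount factor are assumptions) maps a state to one Q-value per action and computes the temporal-difference target used for training.

```python
# Minimal Q-network and TD target, assuming TensorFlow/Keras is installed.
import numpy as np
from tensorflow import keras

def build_q_network(state_dim: int, n_actions: int) -> keras.Model:
    """Map a state vector to one Q-value per action (as in Figure 2)."""
    return keras.Sequential([
        keras.layers.Input(shape=(state_dim,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(n_actions, activation="linear"),
    ])

def td_target(reward: float, next_state: np.ndarray, q_net: keras.Model,
              gamma: float = 0.99, done: bool = False) -> float:
    """y = r + gamma * max_a' Q(s', a'); y = r on terminal transitions."""
    if done:
        return reward
    q_next = q_net.predict(next_state[None, :], verbose=0)[0]
    return reward + gamma * float(np.max(q_next))
```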
[Figure: tasks 1 through n arrive in a task waiting queue; the tasks scheduler within the cloud broker assigns them to the data centers of the cloud provider.]
complete execution time of a set of cloudlets on VMj execution. Figure 4 presents an example of the first-come first-served (FCFS) scheduling process where the number of virtual machines was 2 and the number of tasks was 7. Every task encompassed a varied time unit length. Notably, the makespan denoted the largest execution time between the aforementioned VMs. The makespan (computed on VM1) was 45.

4.2. Environment Definition. This study regarded a system with multiple virtual machines and cloudlets. Every VM encompassed specific attributes (processing power in MIPS, memory in GB, and bandwidth in GB/s). As users submitted distinct cloudlets that arrived in a queue, the broker implemented the defined scheduling algorithm to assign every cloudlet to an adequate VM. As the broker scheduling algorithm needed to make an assignment decision for every cloudlet input from the queue, the system state changed in line with the decision. Figure 5 presents the scheduling of a cloudlet with a length of 3 to VM2.

4.2.1. State Space. Only the time taken by each virtual machine during the execution of a set of tasks was regarded in this study to support the defined system state identification process. The time counted on every virtual machine implied the total time of the cloudlets running on that VM. The virtual machine running time facilitated makespan computation for each new cloudlet delegation where the system state changed. In this vein, the system state at time $t$ with $n$ VMs was given by $\beta_t(\mathrm{VM}_1), \ldots, \beta_t(\mathrm{VM}_i), \ldots, \beta_t(\mathrm{VM}_n)$. Specifically, $\beta_t(\mathrm{VM}_i) = \sum_{j=1}^{k_i} Ex_i(\mathrm{Cloudlet}_j)$, where $Ex_i(\mathrm{Cloudlet}_j)$ denoted the run time of cloudlet $j$ on $\mathrm{VM}_i$ while $k_i$ implied the total number of cloudlets on $\mathrm{VM}_i$. Figure 5 presents the state at $t$ as {9, 7, 11} and at $t + 1$ as {9, 10, 11}.

4.2.2. Action Space. Available agent actions were defined in the action space. The broker scheduling algorithm was required to choose a VM from all current VMs to schedule the existing task from the queue. The agent would make an action in a space with the same dimension as the number of VMs, so that the action space denoted all VMs in the system. With $n$ VMs, the action space was outlined by $\{1, \ldots, i, \ldots, n\}$, wherein $i$ denoted the VM index conceded by the scheduler for cloudlet assignment. In Figure 5, the action space denotes {1, 2, 3}, while the chosen action is 2.

4.3. Model Training. The MCS-DQN model was retrained for each episode in line with the workflow in Figure 6 as follows (a condensed code sketch is given after this list):

Step 1: the environment and agent contexts were established, including server, virtual machine, and cloudlet attributes.
Step 2: the environment state and cloudlet queues were reset.
Step 3: the next cloudlet was selected from the cloudlet queues.
Step 4: the agent selected the next action in line with the existing environment state under the ε factor. Essentially, the ε factor (exploration rate) influenced the choice between exploration and exploitation in every iteration. The probability of the agent arbitrarily choosing a VM (exploration) was ε, while the probability of the agent choosing a VM under the model (exploitation) was 1 − ε. The ε factor (initialised to one) would decrease in every iteration following a decay factor.
Step 5: the environment state was updated by adding the cloudlet execution time to the chosen VM.
Step 6: the environment produced a reward under the recommended reward function in the following subsection.
Step 7: the agent saved the played experience into the experience replay queue.
Step 8: upon experience storage, the algorithm checked for more cloudlets to schedule (repeating from Step 3 if more cloudlets remained).
Step 9: the model was retrained in every episode (upon completing all cloudlet queues) with a batch of stored experiences from the experience replay queue. The experience replay queue was applied as a FIFO queue; the oldest experience was omitted when the queue reached its limit.
Step 10: the algorithm was repeated from Step 2 if the number of iterations had not yet reached the predefined episode limit.
Step 11: the trained MCS-DQN model was saved and the procedure exited.
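The following self-contained Python sketch condenses Steps 1 to 11. The environment, reward, and Q-value computations are simplified stand-ins (the actual reward follows Section 4.4 and the actual network Section 5.1.2), and all hyperparameters here are illustrative assumptions.

```python
import random
from collections import deque

N_VMS, EPISODES, BATCH = 3, 50, 32
CLOUDLET_QUEUE = [5, 3, 7, 2, 9, 4]        # illustrative cloudlet lengths

replay = deque(maxlen=2000)                # Step 9: FIFO queue with a capacity limit
epsilon, decay = 1.0, 0.995                # Step 4: initialised to one, then decayed

def q_values(state):                       # stand-in for the DQN forward pass
    return [random.random() for _ in range(N_VMS)]

def reward_of(state, vm):                  # stand-in for the Section 4.4 reward
    return -max(state)

for episode in range(EPISODES):                            # Step 10: episode loop
    state = [0.0] * N_VMS                                  # Step 2: reset the state
    for length in CLOUDLET_QUEUE:                          # Steps 3 and 8: next cloudlet
        if random.random() < epsilon:                      # Step 4: exploration
            vm = random.randrange(N_VMS)
        else:                                              # Step 4: exploitation
            qv = q_values(state)
            vm = qv.index(max(qv))
        new_state = state.copy()
        new_state[vm] += length                            # Step 5: update the state
        r = reward_of(new_state, vm)                       # Step 6: reward
        replay.append((state, vm, r, new_state))           # Step 7: store experience
        state = new_state
        epsilon *= decay                                   # Step 4: decay factor
    batch = random.sample(replay, min(BATCH, len(replay)))  # Step 9: sample a batch
    # ...retrain the Q-network on `batch` here (omitted)...
# Step 11: save the trained model (e.g. model.save(...)) and exit.
```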
Figure 4: Example of FCFS scheduling with two VMs and seven tasks.

Serial number | Burst time of task
1 | 7
2 | 3
3 | 9
4 | 15
5 | 13
6 | 21
7 | 16

VM1 executes tasks 1 (7), 3 (9), 5 (13), and 7 (16), finishing at 7, 16, 29, and 45; VM2 executes tasks 2 (3), 4 (15), and 6 (21), finishing at 3, 18, and 39.

$Ex_{\mathrm{VM1}} = 7 + 9 + 13 + 16 = 45$, $Ex_{\mathrm{VM2}} = 3 + 15 + 21 = 39$, $\mathrm{Makespan} = \max(Ex_{\mathrm{VM1}}, Ex_{\mathrm{VM2}}) = 45$.
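As a quick check of the Figure 4 numbers, the sketch below reproduces the per-VM totals and the makespan, assuming FCFS dispatches arriving tasks to the two VMs in round-robin order (which matches the figure's assignment).

```python
# Burst times of tasks 1..7 from Figure 4.
burst_times = [7, 3, 9, 15, 13, 21, 16]

vm_load = [0, 0]                        # total execution time per VM
for i, burst in enumerate(burst_times):
    vm_load[i % 2] += burst             # tasks alternate between VM1 and VM2

makespan = max(vm_load)                 # the largest per-VM total
assert vm_load == [45, 39] and makespan == 45
```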
Figure 5: Example of system state change. State t: VM1 = {1, 5, 3}, VM2 = {2, 4, 1}, VM3 = {3, 5, 3}; state t + 1: VM1 = {1, 5, 3}, VM2 = {2, 4, 1, 3}, VM3 = {3, 5, 3} (a cloudlet of length 3 is assigned to VM2).
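The state transition of Figure 5 can be written in a few lines; the function name below is illustrative.

```python
def assign_cloudlet(state: list, vm_index: int, exec_time: float) -> list:
    """Return the next state after scheduling one cloudlet (Section 4.2.1):
    the cloudlet's execution time is added to the chosen VM's total."""
    next_state = state.copy()
    next_state[vm_index] += exec_time
    return next_state

state_t = [9, 7, 11]                        # beta_t(VM1..VM3), as in Figure 5
state_t1 = assign_cloudlet(state_t, 1, 3)   # a cloudlet of length 3 to VM2
assert state_t1 == [9, 10, 11]
```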
Figure 6: The MCS-DQN training workflow (start; reset the cloudlet queue; input a cloudlet; if rand() < ε, select the next VM randomly (exploration), otherwise predict the next VM using the MCS-DQN model (exploitation); compute the MCS-DQN reward; remember the current experience (old state, reward, new state); loop back while cloudlets remain; end).
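The decision branch of the Figure 6 workflow corresponds to the following sketch, assuming a Keras-style model with a predict() method; the names are illustrative.

```python
import random
import numpy as np

def select_vm(state: np.ndarray, n_vms: int, epsilon: float, q_net) -> int:
    """Epsilon-greedy VM selection: explore with probability epsilon,
    otherwise exploit the MCS-DQN model's Q-value estimates."""
    if random.random() < epsilon:                      # exploration branch
        return random.randrange(n_vms)                 # select the next VM randomly
    q_values = q_net.predict(state[None, :], verbose=0)[0]
    return int(np.argmax(q_values))                    # exploitation branch
```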
4.4. Reward Function. The recommended reward function was utilised with the MCS-DQN model in Algorithm 1. The makespan of every potential scheduling was first computed. Every VM was subsequently ranked following the makespan computation during CS. A simple example is provided to present the recommended MCS-DQN reward function (see Figure 7). The example encompasses the reward computation for a specific VM state (elaborated following the total execution time in every VMi). Given five VMs, the VMs carried total cloudlet execution times of {9, 7, 11, 8, 8}. Specifically, a newly arrived cloudlet with a length of five was scheduled to VM2 in the example (see Figure 7(a)) by iterating over the VMs, creating a copy of the VM state in every iteration, adding the cloudlet to the VM chosen in that iteration, and computing the resulting makespan. Figure 7(b) presents the first iteration, where the arrived cloudlet was added to VM1. Figure 7(c) presents the computed makespans. For example, the makespan would be 14 when the cloudlet was added to VM1 in the first iteration, 12 when added to VM2, and so on. For the VM ranking, the computed makespans were sorted from the lowest value (see Figure 7(d)), and the highest score was given to the lowest makespan, with the score decreasing for each following makespan (see Figure 7(e)). Lastly, the corresponding reward was identified from the makespan rank of the VM to be scheduled. In the study context, scheduling to VM2 yielded a reward of 2.
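A Python sketch of this ranking-based reward is given below. It is reconstructed from the Figure 7 walk-through rather than from Algorithm 1 (which is not reproduced here), so the scoring constants (a top score of 2 for five VMs, with equal makespans sharing a score) are inferred from the example.

```python
def mcs_dqn_reward(state: list, cloudlet_len: float, chosen_vm: int) -> int:
    # 1) Hypothetical makespan for every possible placement (Figure 7(b)-(c)).
    makespans = []
    for vm in range(len(state)):
        trial = state.copy()
        trial[vm] += cloudlet_len
        makespans.append(max(trial))
    # 2) Score the placements: the lowest makespan gets the highest score,
    #    decreasing per distinct makespan value (Figure 7(d)-(e)).
    unique = sorted(set(makespans))
    top_score = len(state) // 2            # assumed: yields 2 for five VMs
    scores = [top_score - unique.index(m) for m in makespans]
    # 3) The reward is the score of the VM actually chosen.
    return scores[chosen_vm]

state = [9, 7, 11, 8, 8]                   # Figure 7(a)
assert mcs_dqn_reward(state, 5, 1) == 2    # scheduling to VM2 yields reward 2
assert mcs_dqn_reward(state, 5, 2) == -1   # scheduling to VM3 yields reward -1
```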
Figure 7: Running example of the proposed MCS-DQN reward function. (a) The example initialisation: VM totals {9, 7, 11, 8, 8} and a cloudlet of length 5 to be scheduled to VM2. (b) Adding the cloudlet to VM1. (c) The calculated makespans: VM1 = 14, VM2 = 12, VM3 = 16, VM4 = 13, VM5 = 13. (d) The sorted makespans: VM2 (12), VM5 (13), VM4 (13), VM1 (14), VM3 (16). (e) VM ranks: VM2 = 2, VM5 = 1, VM1 = 0, VM3 = −1.
5. Results and Discussion

5.1. Experimental Setup. The recommended trained model under deep Q-learning was assessed against the FCFS and PSO algorithms with the CloudSim simulator.

5.1.1. CloudSim Parameters. CloudSim is a modular simulation toolkit for modelling and simulating cloud computing systems and application provisioning environments [33]. It enables the modelling of cloud system components such as data centres, virtual machines (VMs), and resource provisioning rules on both a system and a behavioural level [33].

The CloudSim simulator configuration in the implementation began with establishing one data center, two hosts, and five VMs with the subsequent parameters (see Table 1). This configuration setup is taken from example 6 of the CloudSim source code available on GitHub (CloudSim codebase: https://github.com/Cloudslab/cloudsim), which is based on real server and VM information. At the VM level, a time-shared policy (one of the two scheduling policies available in CloudSim) was selected. The time-shared policy allows VMs and cloudlets to multitask and progress immediately within the host. Moreover, the task data used in the experiments are real-world workloads of real computer systems recorded by the High-Performance Computing Center North (HPC2N) in Sweden (the HPC2N data: https://www.cse.huji.ac.il/labs/parallel/workload/l_hpc2n/). The data contain information about tasks such as the number of processors, the average CPU time, the used memory, and other task specifications. The tasks utilised from the workload differ completely from the independent counterparts employed in the trained model.

5.1.2. The MCS-DQN Model Parameters. The MCS-DQN model application employs a neural network with five fully connected layers (see Figure 8): an input layer (for the state), three hidden layers (64 × 128 × 128), and an output layer (for the actions). The network was taken from an original Keras-based implementation [35].
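Under that description, the network could be defined as follows in Keras; the activation functions, optimiser, and loss are assumptions, as the text does not specify them.

```python
from tensorflow import keras

def build_mcs_dqn(n_vms: int) -> keras.Model:
    """Five fully connected layers: state input, 64-128-128 hidden units,
    and one output per VM, as described for Figure 8."""
    model = keras.Sequential([
        keras.layers.Input(shape=(n_vms,)),              # state: running time per VM
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(n_vms, activation="linear"),  # one Q-value per VM/action
    ])
    model.compile(optimizer=keras.optimizers.Adam(), loss="mse")
    return model
```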
800 episodes. Perceivably, learning remained steady after approximately 800 training iterations. The evolution of the ε parameter of the ε-greedy exploration method during training is also shown. Following the increase in agent scores as ε began decaying, MCS-DQN could already generate sufficiently good Q-value estimates for more thoughtful state and action explorations, accelerating the agent learning process.

Figure 9: Training score (agent score and ε against training steps, 0 to 800).

After the training process, various cloudlet sets were executed with the saved MCS-DQN scheduler model and the FCFS and PSO algorithms for every metric assessment.
As every cloudlet of the same set was simultaneously executed, this study essentially emphasised the makespan metric (the elapsed time when simultaneously executing cloudlet groups on the available VMs). Figure 10 presents the reduced makespan of this research compared to the other algorithms.

The makespan metric (employed as the primary model training objective) impacted other performance metrics:

(1) The degree of imbalance (DI) metric demonstrated the load-balancing between VMs. Specifically, DI was utilised to compute the incongruence between VMs when simultaneously executing a set of cloudlets. The DI metric reduction was attempted for a more congruent system. Equation (3) was employed in this research to calculate the DI metric (a code sketch follows below):

$$\mathrm{DI} = \frac{E_{\max} - E_{\min}}{E_{\mathrm{avg}}}, \qquad (3)$$

where $E_{\mathrm{avg}}$, $E_{\min}$, and $E_{\max}$ implied the average, minimum, and maximum total execution time of all VMs [34]. Figure 11 presents the recommended MCS-DQN scheduler minimising the DI metric in every utilised set of cloudlets for an enhanced load-balancing system.
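Equation (3) translates directly to code; the sample values below are illustrative (the per-VM totals of the Figure 4 example).

```python
def degree_of_imbalance(vm_exec_times: list) -> float:
    """DI = (E_max - E_min) / E_avg over the total execution times of all VMs."""
    e_max, e_min = max(vm_exec_times), min(vm_exec_times)
    e_avg = sum(vm_exec_times) / len(vm_exec_times)
    return (e_max - e_min) / e_avg

print(degree_of_imbalance([45, 39]))   # (45 - 39) / 42 = 0.142...
```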
Figure 10: Makespan against the number of cloudlets (20 to 200) for FCFS, PSO, and MCS-DQN.

Figure 11: Average DI against the number of cloudlets (20 to 200) for FCFS, PSO, and MCS-DQN.
Figure 13: Average resource utilisation against the number of cloudlets (20 to 200) for FCFS, PSO, and MCS-DQN.

Figure 14: Increasing the number of virtual machines (makespan in seconds against the number of tasks, 60, 140, and 200, for 10, 15, 20, and 30 VMs; PSO versus MCS-DQN).

6. Conclusion

This study encompassed an effective CS application using deep Q-learning in cloud computing. The recommended MCS-DQN scheduler addressed the TS problem and optimised its metrics. The simulation outcomes revealed that the presented work attained optimal performance for minimal waiting time and makespan and maximum resource employment. Additionally, the recommended algorithm regarded load-balancing during cloudlet distribution to the current resources beyond the PSO and FCFS algorithms. The proposed model can be applied to solve task scheduling problems in cloud computing, specifically in the cloud broker. To overcome the limitation of a fixed number of VMs, we plan in the future to enhance this work by relying on variable-length output prediction using dynamic neural networks to include various VM sizes, as well as adding other optimisation metrics.

Data Availability

The data for this research are available in the "Parallel Workloads Archive: HPC2N Seth": https://www.cse.huji.ac.il/labs/parallel/workload/l_hpc2n/.

Conflicts of Interest

The authors declare no conflicts of interest.

Authors' Contributions

All of the authors participated in the article's development, including information gathering, editing, modelling, and reviewing. The final manuscript was reviewed and approved by all of the authors.

Acknowledgments

References

[1] I. Foster, Y. Zhao, I. Raicu, and S. Lu, "Cloud computing and grid computing 360-degree compared," in Proceedings of the 2008 Grid Computing Environments Workshop, pp. 1–10, IEEE, Austin, TX, USA, November 2008.
[2] M. A. Rodriguez and R. Buyya, "A taxonomy and survey on scheduling algorithms for scientific workflows in IaaS cloud computing environments," Concurrency and Computation: Practice and Experience, vol. 29, no. 8, Article ID e4041, 2017.
[3] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the ICNN'95 International Conference on Neural Networks, IEEE, Perth, WA, Australia, December 1995.
[4] W. Wong and C. I. Ming, "A review on metaheuristic algorithms: recent trends, benchmarking and applications," in Proceedings of the 2019 7th International Conference on Smart Computing & Communications (ICSCC), pp. 1–5, IEEE, Miri, Malaysia, June 2019.
[5] H. S. Al-Olimat, M. Alam, R. Green, and J. K. Lee, "Cloudlet scheduling with particle swarm optimization," in Proceedings of the 2015 Fifth International Conference on Communication Systems and Network Technologies, pp. 991–995, IEEE, Gwalior, India, April 2015.
[6] N. Zhang, X. Yang, M. Zhang, Y. Sun, and K. Long, "A genetic algorithm-based task scheduling for cloud resource crowdfunding model," International Journal of Communication Systems, vol. 31, no. 1, Article ID e3394, 2018.
[7] Q. Guo, "Task scheduling based on ant colony optimization in cloud environment," AIP Conference Proceedings, vol. 1834, Article ID 040039, 2017.
[8] X. Huang, C. Li, H. Chen, and D. An, "Task scheduling in cloud computing using particle swarm optimization with time varying inertia weight strategies," Cluster Computing, vol. 23, no. 13, pp. 1–11, 2019.
[9] Y. Liang, Q. Cui, L. Gan, Z. Xie, and S. Zhai, "A cloud computing task scheduling strategy based on improved particle swarm optimization," in Proceedings of the 2020 2nd International Conference on Big Data and Artificial Intelligence, pp. 543–549, New York, NY, USA, April 2020.
[10] M. S. Ajmal, Z. Iqbal, M. B. Umair, and M. S. Arif, "Flexible genetic algorithm operators for task scheduling in cloud datacenters," in Proceedings of the 2020 14th International Conference on Open Source Systems and Technologies (ICOSST), pp. 1–6, IEEE, Lahore, Pakistan, December 2020.
[11] N. Musa, A. Y. Gital, F. U. Zambuk, A. M. Usman, M. Almutairi, and H. Chiroma, "An enhanced hybrid genetic algorithm and particle swarm optimization based on small position values for tasks scheduling in cloud," in Proceedings of the 2020 2nd International Conference on Computer and Information Sciences (ICCIS), pp. 1–5, IEEE, Sakaka, Saudi Arabia, October 2020.
[12] N. Yi, J. Xu, L. Yan, and L. Huang, "Task optimization and scheduling of distributed cyber-physical system based on improved ant colony algorithm," Future Generation Computer Systems, vol. 109, pp. 134–148, 2020.
[13] Z. Peng, B. Barzegar, M. Yarahmadi, H. Motameni, and P. Pirouzmand, "Energy-aware scheduling of workflow using a heuristic method on green cloud," Scientific Programming, vol. 2020, Article ID 8898059, 2020.
[14] C. Saravanakumar, M. Geetha, S. Manoj Kumar, S. Manikandan, C. Arun, and K. Srivatsan, "An efficient technique for virtual machine clustering and communications using task-based scheduling in cloud computing," Scientific Programming, vol. 2021, Article ID 5586521, 2021.
[15] Y. Sun and X. Qi, "A DE-LS metaheuristic algorithm for hybrid flow-shop scheduling problem considering multiple requirements of customers," Scientific Programming, vol. 2020, Article ID 8811391, 2020.
[16] C. Huang, G. Huang, W. Liu, R. Wang, and M. Xie, "A parallel joint optimized relay selection protocol for wake-up radio enabled WSNs," Physical Communication, vol. 47, Article ID 101320, 2021.
[17] J. Ge, B. Liu, T. Wang, Q. Yang, A. Liu, and A. Li, "Q-learning based flexible task scheduling in a global view for the Internet of Things," Transactions on Emerging Telecommunications Technologies, vol. 32, Article ID e4111, 2020.
[18] D. Ding, X. Fan, Y. Zhao, K. Kang, Q. Yin, and J. Zeng, "Q-learning based dynamic task scheduling for energy-efficient cloud computing," Future Generation Computer Systems, vol. 108, pp. 361–371, 2020.
[19] B. Zhang, W. Wu, X. Bi, and Y. Wang, "A task scheduling algorithm based on Q-learning for WSNs," in Proceedings of the International Conference on Communications and Networking in China, pp. 521–530, Springer, Shanghai, China, December 2018.
[20] H. Che, Z. Bai, R. Zuo, and H. Li, "A deep reinforcement learning approach to the optimization of data center task scheduling," Complexity, vol. 2020, Article ID 3046769, 2020.
[21] T. Dong, F. Xue, C. Xiao, and J. Li, "Task scheduling based on deep reinforcement learning in a cloud manufacturing environment," Concurrency and Computation: Practice and Experience, vol. 32, no. 11, Article ID e5654, 2020.
[22] F. Qi, L. Zhuo, and C. Xin, "Deep reinforcement learning based task scheduling in edge computing networks," in Proceedings of the 2020 IEEE/CIC International Conference on Communications in China (ICCC), pp. 835–840, IEEE, Chongqing, China, August 2020.
[23] M. Chen, T. Wang, S. Zhang, and A. Liu, "Deep reinforcement learning for computation offloading in mobile edge computing environment," Computer Communications, vol. 175, pp. 1–12, 2021.
[24] P. Wang, C. Zhao, Y. Wei, D. Wang, and Z. Zhang, "An adaptive data placement architecture in multicloud environments," Scientific Programming, vol. 2020, Article ID 1704258, 2020.
[25] Z. Zang, W. Wang, Y. Song et al., "Hybrid deep neural network scheduler for job-shop problem based on convolution two-dimensional transformation," Computational Intelligence and Neuroscience, vol. 2019, Article ID 7172842, 2019.
[26] J. Wu, G. Zhang, J. Nie, Y. Peng, and Y. Zhang, "Deep reinforcement learning for scheduling in an edge computing-based industrial Internet of Things," Wireless Communications and Mobile Computing, vol. 2021, Article ID 8017334, 2021.
[27] Y. Wang, H. Liu, W. Zheng et al., "Multi-objective workflow scheduling with deep-Q-network-based multi-agent reinforcement learning," IEEE Access, vol. 7, pp. 39974–39982, 2019.
[28] V. Mnih, K. Kavukcuoglu, D. Silver et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[29] S. Jain, P. Sharma, J. Bhoiwala et al., "Deep Q-learning for navigation of robotic arm for tokamak inspection," in Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing, pp. 62–71, Springer, Xiamen, China, December 2018.
[30] S. He and W. Wang, "Pricing QoE with reinforcement learning for intelligent wireless multimedia communications," in Proceedings of the ICC 2020 IEEE International Conference on Communications (ICC), pp. 1–6, IEEE, Dublin, Ireland, June 2020.
[31] L. J. Lin, "Reinforcement learning for robots using neural networks," Tech. Rep., Carnegie Mellon University School of Computer Science, Pittsburgh, PA, 1993.
[32] A. Karthick, E. Ramaraj, and R. G. Subramanian, "An efficient multi queue job scheduling for cloud computing," in Proceedings of the 2014 World Congress on Computing and Communication Technologies, pp. 164–166, IEEE, Trichirappalli, India, March 2014.
[33] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, and R. Buyya, "CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms," Software: Practice and Experience, vol. 41, no. 1, pp. 23–50, 2011.
[34] M. Kalra and S. Singh, "A review of metaheuristic scheduling techniques in cloud computing," Egyptian Informatics Journal, vol. 16, no. 3, pp. 275–295, 2015.
[35] M. Plappert, keras-rl, 2016, https://github.com/keras-rl/keras-rl.