Task Allocation Algorithm and Optimization Model On Edge Collaboration
Abstract—This paper investigates a mobile edge computing environment for video analysis tasks, where edge nodes provide their computation capacities to process the computation-intensive tasks submitted by end users. First, we introduce a Cloudlet Assisted Cooperative Task Assignment (CACTA) system that organizes edge nodes that are geographically close to an end user into a cluster to collaboratively work on the user's tasks. It is challenging for the system to find an optimal strategy that assigns workload to edge nodes to meet the user's optimization goal. To address this challenge, this paper proposes multiple algorithms for different situations. First, for the situation where historical data cannot be obtained, a multi-round allocation algorithm based on EMA prediction is proposed, and the experimental results demonstrate the efficiency and necessity of multiple rounds of transmission. Second, for the case where historical data is available, this paper introduces a prediction-based dynamic task assignment algorithm that assigns workload to edge nodes in each time slot based on the prediction of their capacities/costs and an empirical optimal allocation strategy learned from an offline optimal solution on historical data. Experimental results demonstrate that the proposed algorithm achieves significantly higher performance than several other algorithms; in particular, its performance is very close to that of an offline optimal solution. Finally, we propose an online task assignment algorithm based on Q-learning, which uses the model-free Q-learning algorithm to actively learn the allocation strategy of the system; the experimental results verify the superiority and effectiveness of this algorithm.

Index Terms—Edge Computing, Task Assignment, Prediction, Q-learning, Optimization Algorithm

I. INTRODUCTION

Driven by the Internet of Things (IoT), the total amount of data created (and not necessarily stored) by IoT will reach 847 ZB per year by 2021, up from 218 ZB per year in 2016 [1]. Edge computing has attracted great attention from academia and industry [2][3]; its main idea is to push computing resources closer to the location of the mobile user. By leveraging devices located on the edge of the network and away from the cloud, various challenges in cloud computing models can be addressed, such as real-time requirements [4], network load pressure [5], and privacy data protection [6].

In addition, with the rapid development of IoT technology and the emergence of various applications that are closely related to location [7], the privacy-sensitive and time-sensitive data generated by edge devices grows rapidly [8][9]. Edge computing can leverage its advantages to meet these requirements, provide a better user experience, and play an indispensable role in future computing models [10]. In fact, there are abundant mobile terminal resources around the end user for collaborative computing and storage, and the development of 5G communication technology will enable edge collaborative computing to become a reality [11][12]. Take video analysis as an example. When we need to find lost children or elderly people, or to track suspects, video analysis and processing is needed. We can allocate the captured videos to a number of nearby end users rather than to a cloud center to find the target person as early as possible, thus avoiding both the long transmission time to the cloud and the disclosure of privacy.

Edge computing is an emerging paradigm that utilizes computing resources at the network edge to deploy heterogeneous applications and services. In an edge computing system, applications such as the Internet of Things can offload computation-intensive tasks to nearby computing resources to meet the increasing demand for high computation power and low latency [13][14][15]. The computation and storage servers in an edge computing system [16], referred to as edge nodes, have the following three major features. First, edge nodes are highly heterogeneous [17]: they might be edge servers with strong computing capability, or mobile terminals such as tablets and smart phones, ranging from embedded computing devices to micro data centers. Second, edge nodes are highly mobile, as they might belong to different owners who might themselves be highly mobile; the residence time of a mobile device in a particular physical area (such as a coffee shop) is uncertain, although it follows certain patterns. In addition, edge nodes are highly dynamic [18]: unlike cloud computing, edge nodes in an edge computing system
An earlier version of this paper was presented at the 2018 IEEE GLOBECOM Conference and was published in its Proceedings, DOI: 10.1109/GLOCOM.2018.8647199
© 2020 published by Elsevier. This manuscript is made available under the Elsevier user license
https://www.elsevier.com/open-access/userlicense/1.0/
might be owned by different entities, which brings additional uncertainty to the provisioning of computing resources in an edge system. The high heterogeneity, mobility, and dynamic nature make edge computing nodes differ from the traditional cloud computing environment, where computation resources are centrally managed and controlled [19]. Therefore, edge nodes' actual computation capacities and costs may vary over time. Given the uncertainty and dynamic nature of edge computing, it is quite challenging to design a system that finds an optimal strategy to assign workload to nearby edge nodes and meet the user's optimization goal. In a previous work [20], we proposed a prediction-based assignment optimization algorithm to find an optimal strategy that assigns workload to edge nodes in an edge computing environment. This manuscript presents an extension of that paper: a multi-round allocation algorithm is proposed to supplement the Prediction-based Optimal Algorithm (PA-OPT) of the conference paper, and a task online assignment algorithm based on Q-learning is proposed.

To address the challenge, we develop a Cloudlet Assisted Cooperative Task Assignment (CACTA) edge computing system, a type of proximity-based Mobile Crowd Service system [21], that organizes edge nodes that are geographically close to an end user into a cluster to collaboratively work on the user's tasks. Different from existing work (e.g., [22][23][24][25][26][27]), our system allows an end user to specify his/her desired level of quality of service, and we propose multiple algorithms for different situations, so that the system can adopt a workload assignment algorithm that satisfies the user's quality of service requirement. The main contributions of the present paper are as follows:

• This paper designs a Cloudlet Assisted Cooperative Task Assignment (CACTA) edge computing system, which dynamically allocates workloads to edge nodes over multiple time slots to optimize task completion time, or to jointly minimize the total task completion time and system cost.
• Considering the uncertainty and stochastic nature of the edge computing system, we formulate the optimization problem by modeling the computation capacities and the costs of edge nodes as time-varying variables. The offline optimal solution is then derived and used as a benchmark for evaluating the performance of the online algorithms.
• To address the situation where long-term historical data is not available to the system, a multi-round allocation algorithm based on EMA prediction (PA-EMA) is designed. This algorithm verifies the efficiency of the multi-round allocation strategy.
• To cope with the second case, where historical data is available, this paper proposes a Prediction-based Optimal Algorithm (PA-OPT), which achieves near-optimal performance. The PA-OPT algorithm predicts a node's computation capability and cost based on historical data and the data observed during task execution, using the Autoregressive Integrated Moving Average (ARIMA) technique. The actual parameters obtained from the video processing application are then used in the CACTA system and combined with the empirical Google Cluster dataset [28] to complete the experiments. The experimental results show that the proposed online prediction algorithm achieves near-optimal performance in environments where the edge nodes have high computation capability and cost predictability, and its performance is no worse than that of other algorithms in settings with poor predictability.
• This paper proposes an online task assignment algorithm based on Q-learning, which adopts the classical reinforcement learning method to achieve real-time resource load balancing and joint optimization. The long-term cumulative reward is maximized by setting the reward function to the weighted sum of the completion time and the cost of the system. The experimental results show that the algorithm is effective and outperforms other existing algorithms.

The rest of this paper is organized as follows. Related work is summarized in Section II. Section III presents the system architecture and the proposed optimization model. Two prediction-based dynamic algorithms are proposed in Section IV. Section V describes the task online assignment algorithm based on Q-learning. The performance evaluations of our proposed algorithms are presented and discussed in Section VI. Section VII concludes this paper.

II. RELATED WORK

The years 2015-2017 saw rapid growth in edge computing, which has attracted close attention from industry and academia worldwide, and a period of steady development is now expected. At present, the main research directions are dynamic scheduling strategies, programming models, computational execution frameworks, close integration with vertical industries, and the deployment of prototype systems.

Resource management and collaboration: In terms of resource management and collaboration, existing work is mainly divided into two scenarios. One tries to allocate the computing tasks offloaded from edge nodes to multiple edge clouds, and the other balances the distribution of edge clouds (i.e., cloudlets) and multiple edge nodes by considering objective factors (such as energy and overhead) [29]. You et al. considered the priority of the user's channel revenue and locally computed energy consumption in mobile edge computing, and allocated resources to multiple users under the same delay constraint [30]. Wang et al. proposed optimization models for spectrum, other resources, and Internet content caching to maximize the user's revenue in mobile edge computing systems, and encouraged users to combine local computing with offloading local computation to the cloud [23]. For the choice of computation mode, Liu et al. solved this selection problem by using a Markov decision model, with the scheduling order based on the queue state of the task buffer [31]. Various machine learning-based approaches for MEC are also summarized in [32]. In [33], the joint task offloading and resource allocation problem
certain amount of workload (e.g., a number of subtasks) to each edge node. Then, in the following rounds, it assigns a number of subtasks to edge nodes based on the prediction or estimation of the capacities and available times of those edge nodes in future rounds. The prediction is carried out by the prediction module based on historical data. The predicted results are then utilized by the allocation module to determine the assignment of workload to edge nodes via some optimal allocation strategy.

In contrast to the tightly managed and controlled computing nodes in cloud computing, the edge computing nodes' capacities and their stay or presence times in the system are often difficult to know in advance, for the following reasons. They may be owned by different entities. They may have just joined the system at the time when the cloudlet attempts to assign tasks to them, so that no historical data is available. Even for an edge node that has stayed in the system for a long time, the fraction of its CPU and memory that is available to the system may fluctuate over time, because the edge node is also concurrently utilized for other computing tasks (assigned by its owner) outside the system, or the node might simply leave the system due to the mobility of its owner. Therefore, it is important for the system to predict or estimate the capacities of edge nodes in the future in order to assign appropriate workload to them. Details of our prediction technique will be presented later.

B. System Model

This section describes the CACTA system model that we use to implement our task offloading algorithm. We consider a set of end users M = {1, 2, ..., UM}, whose task requests are sent to a cloudlet. Each end user m has a computation task Jm. The mathematical variables used throughout this work are summarized in Table I. Suppose that the cloudlet of a CACTA system allocates the task Jm to a set of edge nodes or servers, denoted by Sm = {1, 2, ..., Nsm}. Each edge node has a certain amount of computation resources. When an end user sends a computing task request to the nearby cloudlet, the cloudlet works as a coordinator that allocates the task among the edge nodes. The set Rm of time slots in which the system works on task Jm should be large enough that a task with a finite number of subtasks can always be accomplished through the system; here we assume that Rm is sufficiently large so that this is always the case. Note that a CACTA system may allocate its resources to multiple end users. Without loss of generality, we focus on one end user and omit the notation m for the rest of our discussion.

We define a total cost metric Ctotal as a joint function of the cost and the latency of executing a task:

Ctotal = αCalg + βTalg    (1)

Note that Calg includes the following costs: (1) Computation cost. (2) Storage cost: an edge node asks the client for a payment per unit task that it executes and keeps in local storage, and this payment might vary over time. (3) Data transmission cost: transmitting a certain number of subtasks from the client to an edge server node incurs some transmission cost. For these reasons, the system needs to make an appropriate allocation of subtasks to the edge nodes in order to save cost. A low value of Ctotal corresponds to high system performance. Here α(≥ 0) and β(≥ 0) are weighting factors that trade off cost against the task's completion time; β > α means that the user puts more weight on task completion time than on cost. Therefore, letting W be the size of the data, we can characterize a user's task J as J ≜ (W, α, β).

Because neither the system nor the users can fully know the capacities of all edge nodes at all future time slots, the system can only estimate or predict those values. Let b̂jt denote the estimated capacity of node j at time slot t. The number of subtasks that the system assigns to edge node j is denoted by njt, which is upper bounded by b̂jt, i.e., njt ≤ b̂jt; it is not necessarily true that njt = b̂jt, because the system attempts to minimize the total cost metric, not just the task completion time. The number of subtasks actually completed in round t is denoted by b̃jt, which is the smaller of njt and bjt, i.e., b̃jt = min{njt, bjt}.

If we know Talg, then Calg can be computed as follows:

Calg = Σ_{j=1}^{Ns} Σ_{t=1}^{Talg} cjt · b̃jt    (2)
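As a concrete illustration of Eqs. (1) and (2), the sketch below evaluates the total cost metric for a candidate assignment. All numbers are hypothetical toy data, and the sketch assumes that Calg charges node j's per-subtask cost cjt for the b̃jt = min{njt, bjt} subtasks it actually completes.

```python
# Toy illustration of the total cost metric C_total = alpha*C_alg + beta*T_alg.
# Assumption: C_alg charges c[j][t] per subtask actually completed,
# where completed work per slot is b~_jt = min{n_jt, b_jt}.

def total_cost(n, b, c, alpha, beta):
    """n[j][t]: subtasks assigned to node j in slot t; returns C_total."""
    T_alg = len(n[0])                      # number of time slots used
    C_alg = 0.0
    for j in range(len(n)):
        for t in range(T_alg):
            done = min(n[j][t], b[j][t])   # completed subtasks b~_jt
            C_alg += c[j][t] * done        # payment for completed work
    return alpha * C_alg + beta * T_alg    # Eq. (1)

# Two nodes over three slots: node 0 is cheap but slow, node 1 fast but costly.
n = [[5, 5, 5], [10, 10, 10]]
b = [[4, 5, 6], [12, 9, 10]]
c = [[1.0, 1.0, 1.2], [2.0, 2.5, 2.0]]
print(total_cost(n, b, c, alpha=6, beta=3))   # → 474.0
```

A larger α pushes the optimum toward cheap-but-slow assignments, while a larger β rewards finishing in fewer slots, which is the trade-off the weighting factors are meant to capture.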
TABLE I
PARAMETERS

Symbol      Definition
M           The set of end users
Jm          The computation task of end user m
Sm          The set of edge nodes
Rm          The set of time slots in which the system works on task Jm
τ           Time slot length
cm_jt       The cost of edge node j in round t, with t ∈ Rm
bm_jt       The computation capacity of edge node j in round t
Cm          The set of cm_jt
Bm          The set of bm_jt
Ctotal      The total cost metric
Talg        The actual number of time slots to complete the task via algorithm alg
Calg        The cost incurred by algorithm alg to complete the task
b̂m_jt      The estimated capacity of node j at time slot t
njt         The number of subtasks the system assigns to edge node j
ccomp_jt    The computation cost
ctran_jt    The transmission cost
cstor_jt    The storage cost

where Calg({njt}) is given by (2); (4) shows that njt is upper bounded by the estimated capacity; and (5) means that the total number of assigned subtasks equals the actual total number of subtasks.

Challenge. Note that Talg, i.e., the number of terms added together to obtain Calg, is itself determined by the decision variables {njt}; it is therefore challenging to solve the above non-linear optimization problem, because Talg (the number of summation terms) appears in the objective function. Furthermore, a solution to the above optimization problem might not complete all subtasks in practice, because the assignment njt may exceed bjt, the actual capacity of edge node j, so that only b̃jt = min{njt, bjt} subtasks are completed.

To address this challenge, we first propose an offline optimization algorithm that obtains the optimal solution n*jt of (3) under the assumption of complete information, i.e., knowing all cjt's and all bjt's. Then, to obtain a simple and effective online solution, we develop an online algorithm based on prediction through the following steps. We first utilize the ARIMA technique to predict bjt and cjt in a real edge computing environment, where the capacity and cost of edge node j in each time slot t are stochastic and unknown in advance. Going back to the historical dataset, we then find the subset whose bjt's and cjt's have the highest similarity with the predicted values, i.e., the least squared error among all subsets when compared with our predictions. Next, we derive an optimal solution for those subsets and let the system learn a regression model that expresses the optimal task assignment n*jt as a function of bjt and cjt. It is worth noting that the learned regression model, referred to as an estimated optimal allocation strategy, is based on the historical dataset recording the bjt and cjt observed in the past; it is then applied in our working system to allocate workload to each node at time t.

D. Offline Optimization with Complete System Information

Consider an ideal case where the system has full knowledge of bjt and cjt of all the edge nodes in all time slots. Then we find the T* that corresponds to the minimum total cost C*total. Note that C*total is a lower bound for all algorithms that do not have the system's complete information.

We first note that Tcost_only and Ttime_only are an upper bound and a lower bound on T*, respectively. The upper bound Tcost_only is the time that minimizes only the cost in Eqn. (3), and the lower bound Ttime_only is the minimum time to finish all subtasks without considering cost. To find Tcost_only, we assign subtasks only to the lowest-cost node k in each round t by the rule nkt = bkt if ckt ≤ cjt, ∀j (nodes with higher cost receive no subtasks), until all subtasks have been assigned. To find Ttime_only, we let njt = bjt, ∀j, until the assignment of all subtasks is finished. Since T* minimizes the combination of completion time and cost, we have Ttime_only ≤ T* ≤ Tcost_only. Note that for any T ∈ [Ttime_only, Tcost_only], Eqn. (5) can always be satisfied, and the feasible solution space of the optimization problem given by (3)-(6) is not empty. Thus, for each T ∈ [Ttime_only, Tcost_only], we can solve (3)-(6) to get a minimized Ctotal(T) (a function of T), and then choose the T* that gives the lowest Ctotal(T*). It is interesting to note that Ctotal is a smooth convex function of T on [Ttime_only, Tcost_only], so there exists a unique T* that minimizes Ctotal in this range.

[Fig. 3. Ctotal varies along task completion time T, with 1000*1000 frames; curves are shown for α = 6, β = 3 and α = 5, β = 5.]

We next illustrate the above optimization procedure via an example. We solve the optimization problem with complete system information using edge nodes' computation capacities sampled from the Google dataset [28]. Fig. 3 shows an example where an image processing task has W = 4×10^5 image frames (subtasks) to be processed by the edge nodes. For every combination of the constants α and β there is a minimum Ctotal. Taking α = 6, β = 3 as an example, we observe that Ttime_only is 4, Tcost_only is 860, and Ctotal is minimized when T is about 200.

IV. PREDICTION-BASED DYNAMIC ALGORITHM

For a dynamic edge computing environment where edge nodes' capacities are stochastic, the basic idea of a dynamic algorithm is to assign workload to edge nodes in multiple stages over time. A stage can include multiple time slots or a single time slot. In each stage, the algorithm predicts all
edge nodes' capacities in the stage (which might include a single time slot or multiple time slots), and decides the workload assignment via the Exponential Moving Average (EMA) technique or via a decision strategy that specifies the assignment as a function of the predicted capacities and costs of the edge nodes. The algorithm keeps updating its prediction based on its observations of the actual capacities of the edge nodes in the past, and it keeps generating new assignments of the remaining workload based on its most recent prediction. Depending on whether the system can obtain long-term historical data, we introduce two dynamic algorithms: the PA-EMA algorithm (Prediction-based Assignment based on EMA) and the PA-OPT algorithm (Prediction-based Assignment Optimization). The PA-EMA algorithm predicts the nodes' capacities in the next round simply from the capacities observed in the current round, using the EMA technique. The PA-OPT algorithm is based on the prediction of the edge nodes' computation capacities/costs and an estimated optimal allocation strategy learned from a historical dataset. The objective of a prediction is to minimize the error in estimating an edge node's capacities in future rounds, given below:

min |bjt − b̂jt|, ∀j ∈ S, ∀t ∈ R    (7)

where b̂jt is the predicted capacity.

In the following, we first present the two algorithms, and then discuss the prediction and learning components of the PA-OPT algorithm in detail.

A. Prediction-based Assignment based on Exponential Moving Average

This algorithm addresses a situation where long-term historical data is not available to the system. Note that edge nodes are highly mobile: the time spent in a certain area is uncertain, and a node may leave the area at any time. Second, an edge node may participate in multiple cooperation groups and may also occupy part of its resources to meet its own needs, so the computing performance of a node fluctuates over time. In addition, edge nodes are highly heterogeneous: they may be edge servers with strong computing capability, or mobile terminals such as tablets and smart phones, so their storage and computing resources differ.

In this case, the system at time t predicts the available capacity of edge node j in the next round t + 1, denoted by b̂j(t+1), via the EMA technique [42]. Based on the previous t rounds' capacities of edge node j, i.e., b1, b2, ..., bt, we can get b̂t+1, its expected capacity at time t + 1, as follows:

b̂t+1 = r · bt + (1 − r) · b̂t    (8)

where r is the smoothing index, adjusted empirically to achieve the best accuracy. The closer a term is to the current time, the greater its weight. The PA-EMA algorithm conducts workload assignment based on the predicted capacity of each node, that is, it lets nj,t+1 = b̂t+1; it is shown in Algorithm 1.

Algorithm 1 PA-EMA
Input: A task with size W; edge node set S = {1, 2, ..., Ns}
Output: njt, the workload assigned to node j in time slot t, with W = Σj Σt njt, j ∈ S, t ∈ R
1: while Wassigned < W do
2:   for each round t ∈ {1, 2, ..., Nr} do
3:     for each edge node j ∈ {1, 2, ..., Ns} do
4:       AvailableTimeOfNodes()
5:       // judge whether edge node j is in the collaborative group
6:       Predict b̂j(t+1) using the EMA model according to Eq. (8)
7:       Let nj(t+1) = b̂j(t+1) and assign the workload of nj(t+1) subtasks to edge node j
8:     end for
9:   end for
10: end while

B. Prediction-based Assignment Optimization Algorithm

This algorithm addresses a situation where the edge nodes of a system have stayed in the system for a long time, and the system has a record of the historical data of those edge nodes' computing capacities.

Note that each edge node asks for a payment for each subtask that it works on. This payment can be regarded as the cost to the end user who submits the task, and in practice the cost of each edge node varies over time. For example, an edge node with high available CPU capacity might ask for a high payment: different tasks have different timeliness requirements, and a stronger computing capability of the edge node means a higher cost to the end user.

The algorithm utilizes the ARIMA technique [43] to predict b̂jt and ĉjt based on the historical dataset, and derives its decision strategy by learning from the historical data a model that expresses the optimal task assignment njt as a function of cjt and bjt, i.e., njt = f*(bjt, cjt). The estimated strategy is then denoted by

n̂jt = f̂*(b̂jt, ĉjt)    (9)

The PA-OPT algorithm, shown in Algorithm 2, performs workload assignment to edge nodes over multiple future rounds. The basic idea of the algorithm is as follows. It predicts or estimates all edge nodes' capacities in multiple future time slots using the ARIMA model, i.e., b̂jt, then computes njt with the regression model according to Eq. (10) and assigns a workload of njt subtasks to edge node j. Recall that this algorithm yields a minimal total cost according to Eq. (3) and a task completion time T. Since all edge nodes' capacities are estimated, the task completion time is also an estimate, denoted T̃; within T̃ the task might not be completed in practice, owing to the difference between the nodes' actual and estimated capacities. Thus, at the end of T̃, if there are still subtasks left, we repeat the previous steps until all subtasks are completed. A formal description of the algorithm is given in Algorithm 2.

In practice, the computation capacities of all edge nodes in all time slots cannot be known by the system.
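The PA-EMA loop of Algorithm 1, with the EMA update of Eq. (8), can be sketched as follows. The observed capacities are hypothetical toy data, and the group-membership check (AvailableTimeOfNodes) and rounding details are omitted.

```python
# Sketch of PA-EMA (Algorithm 1): in every round, predict each node's
# capacity for the next round with the EMA of Eq. (8) and assign that
# many subtasks. The observed capacities are hypothetical toy data.

def pa_ema(W, observed, r=0.5):
    """observed[t][j]: actual capacity of node j in round t."""
    n_nodes = len(observed[0])
    b_hat = [float(x) for x in observed[0]]     # bootstrap the prediction
    completed, rounds = 0.0, 0
    for t in range(1, len(observed)):
        if completed >= W:
            break
        rounds += 1
        for j in range(n_nodes):
            n_jt = b_hat[j]                         # assign n_jt = b^_jt
            completed += min(n_jt, observed[t][j])  # b~_jt = min{n_jt, b_jt}
            # EMA update for the next round, Eq. (8)
            b_hat[j] = r * observed[t][j] + (1 - r) * b_hat[j]
    return completed, rounds

print(pa_ema(20, [[5, 5], [5, 5], [5, 5]]))   # → (20.0, 2)
```

Because the assignment equals the prediction, any over-prediction simply wastes the unfinished share of the round, which is why the outer loop keeps reassigning the remaining workload.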
Algorithm 2 PA-OPT (concluding lines)
17:     Wjremaining ← Wjremaining − bj,T̃+Tj
18:     Tj ← Tj + 1
19:   end while
20: end for
21: T ← T̃ + max{Tj}  // T can only be obtained once all nodes finish the task together
22: return T, C and njt, ∀j, ∀t

Also, a system can only get the information of all edge nodes in a certain time slot. To assign tasks to the edge nodes in a certain time slot with only the information of all nodes in that slot, we analyze the data obtained by the offline algorithm, in which the system knows all the information of all nodes from the beginning, and model the number of subtasks njt assigned to edge node j as a nonlinear combination of the computation capacity bjt and the cost cjt.

To address the nonlinear optimization problem given in (3), we propose a functional relationship (the optimal allocation strategy) between the optimal assignment solution n*jt and the system parameters bjt and cjt:

n*jt = f*(bjt, cjt) = a1 + a2·cjt + a3·bjt + a4·bjt/cjt    (10)

The optimal assignment is determined by the capacity bjt and the cost cjt; furthermore, it is related to the capacity per unit cost, i.e., bjt/cjt. We can find the optimal n*jt for a given set of system parameters bjt and cjt according to Section III-D. As the regression model is a mathematical model, the njt it computes can be negative; when this happens, we replace the negative value by zero. Note that the regression model is tied to the parameters (W, α, β), so different settings of (W, α, β) correspond to different regression models.

Given observations up to time t, the model produces predictions for the next z time slots, i.e., bt+1|t, ..., bt+z|t. By computing the one-step-ahead prediction with ARIMA iteratively, we can obtain the multi-step prediction result bt+z. The z-th step prediction bt+z|t can be expressed as

bt+z|t = F(bt+z−p|t, bt+z−p+1|t, ..., bt+z−1|t)    (11)

where the function F is the prediction model and p is the number of lags. If bt+1 is stationary for each t, the series can be described by an ARMA(p, q) model, and the time series is given by

bt+1 = φ1·bt + ... + φp·bt−p+1 + at+1 + θ1·at + ... + θq·at−q+1    (12)

where φc, c = 1, 2, ..., p, and θh, h = 1, 2, ..., q, are the coefficients estimated from a training set, and at is a white noise series following a normal distribution with mean zero and unknown variance σ². The parameters p and q are the orders of the AR (autoregressive) and MA (moving average) parts, respectively. If the time series is non-stationary, we use an ARIMA(p, d, q) model to describe bt. Let η be the backward shift operator, defined by η^c bt = bt−c; then (1 − η)bt = bt − bt−1. Denoting the difference operator by ∇, an order-d difference series can be expressed as

∇^d bt = (1 − η)^d bt    (13)

After the time series has been differenced d times to obtain a stationary series, the ARIMA model can be written as

(1 − Σ_{c=1}^{p} φc·η^c)(1 − η)^d bt = (1 + Σ_{c=1}^{q} θc·η^c)·at    (14)

To measure the accuracy of the ARIMA prediction, we introduce the Mean Squared Error (MSE) to represent the deviation between the predicted and observed capacities.
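The iterative one-step-ahead forecasting of Eq. (11) and the MSE accuracy measure can be sketched as follows. A fixed AR(2) model with hypothetical coefficients stands in for the fitted ARIMA model, and the capacity series is toy data.

```python
# Sketch of iterated multi-step prediction (Eq. (11)) plus MSE.
# A hand-picked AR(2) model stands in for a fitted ARIMA; both the
# coefficients and the capacity history are hypothetical.

def predict_ahead(history, phi, z):
    """Roll an AR(p) model forward z steps, feeding predictions back in."""
    p = len(phi)
    window = list(history[-p:])
    preds = []
    for _ in range(z):
        nxt = sum(phi[k] * window[-1 - k] for k in range(p))
        preds.append(nxt)
        window.append(nxt)          # one-step prediction becomes an input
    return preds

def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

series = [10.0, 10.0, 10.0, 10.0]        # toy capacity history
phi = [0.5, 0.5]                         # hypothetical AR(2) coefficients
preds = predict_ahead(series, phi, z=3)  # → [10.0, 10.0, 10.0]
print(mse([10.0, 9.0, 11.0], preds))     # → 0.6666666666666666
```

In a real deployment the coefficients would be estimated from the historical capacity records (e.g., with a time-series library) rather than fixed by hand.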
[Figure: MSE deviation of the capacity prediction for nodes 1-6 (y-axis: MSE Deviation, 0.8-1.2).]

Assignment. The dynamic online algorithms considered include Short-Term Greedy Assignment and PA-EMA.

The basic idea of the one-time static algorithms is that the system divides a whole submitted task into multiple workload shares and assigns a workload share to each edge node before executing the task.
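One-time static splitting of the kind described above can be sketched as follows; splitting in proportion to each node's estimated capacity is one plausible rule (the source does not fix the rule), and the capacities are hypothetical.

```python
# Sketch of a one-time static baseline: the task is split once, before
# execution, in proportion to each node's estimated capacity, with no
# later reassignment. The capacities are hypothetical toy numbers.

def static_split(W, est_capacity):
    total = sum(est_capacity)
    shares = [W * cap // total for cap in est_capacity]
    shares[0] += W - sum(shares)        # give the rounding remainder to node 0
    return shares

print(static_split(100, [4, 2, 2]))     # → [50, 25, 25]
```

Because the split never reacts to observed capacities, such baselines suffer when a node's actual capacity drifts from its estimate, which is what the multi-round dynamic algorithms are designed to avoid.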
execute the task. The reason for not splitting is that the video file is not too large and it is easy to retain information such as the geographical location. This section considers the continuous arrival of tasks and the dynamic changes of resource-providing nodes. A task online assignment algorithm based on Q-learning is proposed for video analysis tasks. Compared with the above two algorithms, two main points are considered and improved.
• Not only the dynamic changes of available nodes over time are considered, but also the real-time impact on node resources after task assignment.
• While performing the task assignment online, it is not necessary to predict the future computation capability and cost of the edge nodes to achieve the joint optimization of completion time and system cost, as the PA-OPT algorithm does.

A. System Model

Consider a collaboration cluster that has Ns edge nodes at the edge of the network. Each node has different computing resources, such as CPU and memory. The set of available edge nodes is represented as E = {1, 2, ..., m}. Here, the system time is discretized into multiple time slots, and all tasks arrive at the edge nodes in an online manner. This section assumes that only one task arrives in each time slot, which means that the scheduling strategy selects one task for scheduling in each time slot. The task set W = {w1, w2, ..., wn} represents the size of each task wi. When the resources required for the task computation exceed the available resources of a node, the node will compute the task with its weakest computing capability until the occupied resources are released. Consistent with the above, the computation capability of an edge node is heterogeneous. The computation capability here refers to the number of processed video frames: the more video frames processed per unit time, the stronger the computation capability and the greater the computation cost of the node. Assuming that the number of frames computed by edge node j in time slot t and the computation cost per frame are bjt and cjt, respectively, the computation time Twi to complete task wi satisfies wi = ∑_{t=1}^{Twi} bjt, and the computation cost is Cwi = ∑_{t=1}^{Twi} cjt · bjt. Similar to previous work, this section assumes that the size and the required resources of each task are known when the task arrives. Despite many assumptions, this simple model describes the scheduling between multiple tasks and multiple resources well, and demonstrates the effectiveness of reinforcement learning in resource scheduling without cumbersome settings. Therefore, the system goal of this section is to minimize the completion time and system cost of all tasks, introducing a weight factor κ to balance the relationship between the two parts, as shown below:

Minimize ∑_{i=1}^{n} (κ · Twi + (1 − κ) · Cwi)    (17)

where κ represents the weight of the task computation time and κ ∈ [0, 1].

There are three key elements in the reinforcement learning method, namely state space, action space, and reward. The scheduling node is equivalent to the Agent, which can observe the status of all edge nodes and tasks.

State space: In this section, the state of the system, that is, the current allocation and task state of the collaboration group, is represented by a grid of independent blocks, as shown in Fig. 5. The squares in the figure represent the number of different node resources (mainly considering CPU) occupied by multiple tasks in multiple time slots from the current time. Squares of different colors represent different tasks. For example, the green box on the far left indicates the resources of one unit of CPU in the three slots of node 1. The rightmost task queue in the figure represents the waiting or upcoming tasks. To simplify the model, only one task arrives per slot. This section uses three levels to divide the computation capability bjt of different nodes: low level (CL = 1), intermediate level (CL = 2), and advanced level (CL = 3), corresponding to fewer than 5 video frames processed per second, between 5 and 10 frames, and more than 10 frames, respectively, which can effectively reduce the space complexity. Therefore, one state of the system s = (B⃗t, wi) contains two parts, namely the computation capability level vector B⃗t = (b1t, b2t, ..., bmt) of all nodes in time slot t and the task size wi. The state space is represented as S and its size is |S| = 3^m · n. Incorporating the size of tasks into the state can better achieve adaptability and sensitivity to upcoming tasks.

Action space: In each time slot, the scheduling node assigns a task to a node. An action is defined as a = ⟨aji⟩, ∀i ∈ {1, 2, ..., n}, ∀j ∈ {1, 2, ..., m}, with ∑_{j=1}^{m} aji = 1, ∀i, which means that a task can only be assigned to one node. Therefore, the action space A contains all feasible actions, and its size is |A| = m^n; but since only one task is allocated in one time slot, the action space is reduced to the linear space m.

Reward: The reward is designed to guide the Agent to find a good solution for the system goal: minimize the weighted sum of completion time and system cost. Specifically, this paper defines ri = −(κ · Twi + (1 − κ) · Cwi) as the reward for each time slot. The Agent will receive a reward at each time step, so this section simply defines the reward as the negative of the weighted sum of the computation time and the computation cost, i.e., maximizing the cumulative reward ∑_{t=1}^{T} rt.

B. Algorithm Description

Reinforcement learning is mainly divided into two types, model-based and model-free; Q-learning is model-free, requiring no prior knowledge or model of the environment [50]. The model-free type can be further divided into online and offline algorithms. An offline algorithm may learn from previous experience or from the experience of others, such as Q-learning. The Sarsa algorithm is an online algorithm that selects states and actions before the next time slot begins. Simply put, the computation capability of the nodes and the task amount observed by the Agent are taken as input, and the action for the next time slot is performed by the Agent.
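As a concrete illustration of this model, the sketch below computes Twi and Cwi from per-slot traces and evaluates the weighted objective of Eq. (17). The trace values and helper names are hypothetical, not taken from the paper's dataset.

```python
# Sketch of the per-task model: wi = sum_{t=1..Twi} b_jt and
# C_wi = sum_{t=1..Twi} c_jt * b_jt, with illustrative traces.

def task_time_and_cost(w_i, b_j, c_j):
    """Return (T_wi, C_wi): slots needed to process w_i frames on node j,
    and the accumulated cost, given per-slot frames b_j[t] and per-frame
    costs c_j[t]. The final slot is charged in full, as in the sum above."""
    processed, cost = 0, 0.0
    for t, (b, c) in enumerate(zip(b_j, c_j), start=1):
        processed += b
        cost += c * b
        if processed >= w_i:
            return t, cost
    raise ValueError("node traces too short to finish the task")

def objective(assignments, kappa=0.5):
    """Weighted sum of Eq. (17): sum_i kappa*T_wi + (1-kappa)*C_wi."""
    return sum(kappa * T + (1 - kappa) * C for T, C in assignments)

T, C = task_time_and_cost(w_i=20, b_j=[8, 8, 8], c_j=[0.1, 0.1, 0.1])
# 20 frames at 8 frames/slot -> T = 3 slots, C ≈ 2.4
```

Note that, following the summation in the model, the last slot contributes its full frame count to the cost even if fewer frames were strictly needed.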
[Figure: frames per second and number of rounds versus CPU usage (%), comparing PA-EMA and Uniform Assignment.]
computer. The data is sent through WiFi to the edge server. The edge server tested in this paper is an Intel Core i7-6560U, with a CPU of 2.20 GHz and a memory of 2048 MB.
capacity, and its value is between 0.01 and 1. The storage cost and data transmission cost follow the two uniform distributions U(0.01, 0.6) and U(0.01, 0.3), respectively.
2) Predictability of edge nodes' capacities and costs: We evaluate the performance in three different prediction settings: 1) highly predictable setting, in which the computing capacities and costs of edge nodes can be predicted fairly accurately; 2) poorly predictable setting, in which it is almost impossible to predict the capabilities and costs of nodes; 3) mixed setting, in which there are approximately the same number of highly and poorly predictable nodes. For the highly predictable setting, we let each node take the average capacity value of this node in the Google data set. For the poorly predictable setting, we let an edge node's computation capacity be a random number sampled from a uniform distribution. Since the capacity that a node can take is a random value between the min and the max, it is difficult to predict its capacity in a time slot. For the mixed setting, we randomly sample a set of highly predictable nodes and some poorly predictable nodes. In addition, we choose another set of 58 nodes in the Google dataset as a set of nodes with reasonably high predictability.
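The three prediction settings can be mimicked with a small capacity-sampling sketch; the node counts, capacity bounds, and function names below are illustrative assumptions, not the Google-trace values used in the paper.

```python
import random

def make_nodes(n, setting, lo=5.0, hi=15.0, seed=None):
    """Return n per-slot capacity generators under one of the three
    prediction settings. lo/hi bounds are illustrative stand-ins."""
    rng = random.Random(seed)

    def predictable():
        avg = rng.uniform(lo, hi)           # stand-in for a node's trace average
        return lambda: avg                  # capacity is constant, hence predictable

    def unpredictable():
        return lambda: rng.uniform(lo, hi)  # fresh random draw every slot

    if setting == "high":
        return [predictable() for _ in range(n)]
    if setting == "poor":
        return [unpredictable() for _ in range(n)]
    if setting == "mixed":                  # roughly half of each node type
        half = n // 2
        return ([predictable() for _ in range(half)]
                + [unpredictable() for _ in range(n - half)])
    raise ValueError(setting)

nodes = make_nodes(100, "mixed", seed=1)
caps = [cap() for cap in nodes]             # capacities for one time slot
```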
Fig. 9. Empirical CDF of prediction errors of a set of 58 Google nodes.

Fig. 10. Empirical CDF of prediction errors of a set of 100 Google nodes.

3) Evaluation Results: Figure 11 shows the results of the On-Off model, where the x-axis represents the number of frames or subtasks in a video processing (face recognition) application, and the y-axis represents the competitive ratio of the six algorithms. Note that the competitive ratio of an algorithm alg is defined as cr,alg = Ctotal,alg / Ctotal,offline, where Ctotal,alg and Ctotal,offline denote the total weighted cost achieved by algorithm alg and by the offline algorithm, respectively. PA-OPT performs well across task sizes, and its cr,alg is very close to 1. When the task size increases, the proposed PA-OPT algorithm is still close to the optimal performance, while the performance of other algorithms becomes worse. These results are not surprising: using the On-Off model, the CACTA system can predict the capacities/costs of edge nodes, and learn the estimated optimal allocation strategy quite accurately. Here we use 100 nodes and set α and β to be 6 and 3, respectively.

It is worth noting that the PA-OPT algorithm still performs better than other algorithms in the setting where edge nodes' capacities/costs are poorly predictable. However, if we compare Figure 12 with Figure 11, we can find that the performance of PA-OPT is worse in the setting with poor predictability, which is reasonable, because the performance of PA-OPT depends on the prediction accuracy.
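The competitive-ratio metric can be computed directly as defined; the per-algorithm cost figures below are purely illustrative, not measured values.

```python
# Competitive ratio c_{r,alg} = C_total,alg / C_total,offline:
# a value near 1 means the algorithm is close to the offline optimum.

def competitive_ratio(c_total_alg, c_total_offline):
    return c_total_alg / c_total_offline

example_costs = {"PA-OPT": 104.0, "Greedy": 150.0, "Uniform": 380.0}  # hypothetical
ratios = {alg: competitive_ratio(c, 100.0) for alg, c in example_costs.items()}
# PA-OPT's ratio stays near 1.0, mirroring the trend reported for Fig. 11
```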
Fig. 14. Performance of a set of 100 Google nodes.

This paper further conducts experiments based on the set of 58 Google nodes with fairly reasonable predictability of capacities/costs and the mixed predictability set of 100 Google nodes. The performance comparison of four algorithms using these two sets is shown in Fig. 13 and Fig. 14. Uniform and Fair Ratio algorithms are not shown here as their competitive ratios are very high.

First of all, it is noted that PA-OPT achieves the best competitive ratio among all algorithms for the set of 58 nodes. Interestingly, when the task size is 1000, the performance of PA-EMA and the random algorithm is not much worse than that of PA-OPT. This is because, even though the aggregate prediction error of the 58 nodes is low, there is still large predictability randomness among nodes. When the task size gets larger (W = 5000), the performance of PA-OPT gets further improved. For a mixed environment (i.e., the set of 100 nodes), the four dynamic algorithms show approximately the same competitive ratio when W = 1000; our PA-OPT algorithm still performs better than other algorithms, but it performs less well when W = 2000 and W = 5000. The Greedy algorithm's performance gets degraded significantly.

In the experiments for the QL algorithm, the exploration threshold is set to 0.9. When the random number is greater than 0.9, the Agent will randomly select the action instead of selecting the action that corresponds to the maximum Q value. Also, the discount factor γ = 0.9, the learning rate α = 0.01, and the initial values of the Q table are zeros. It is worth noting that the algorithm is aimed at video analysis tasks: the computation capability and task size discussed in this chapter are all in frames. In the process of the experiment, the relationship between the CPU utilization and the number of processed frames in Fig. 8 is used as a reference for the conversion between the computation capability level and the number of processed frames, and a regression equation is obtained. In addition, it is assumed that each task has the same request for resources, which is 0.5 computing units, that is, 0.5 computation capability. When the computing resources are fully occupied, the node will run with the lowest computation capability. Similarly, the cost required per frame is proportional to the actual computation capability.

As shown in Fig. 15, when tasks with six different sizes arrive in different time slots, the task completion time and computation cost are obtained by adjusting the weighting factor κ. It can be clearly seen from the figure that as κ increases, that is, the higher the delay requirement, the actual task completion time obtained according to the QL algorithm allocation result is continuously reduced, and the system cost is gradually increased. The completion time and system cost are almost equal when κ = 0.4. It can be seen that the Agent effectively learned the feedback from the environment, which proves the rationality of the reward function setting. Since the learning effect of the Agent is related to the number of training iterations, this section compares the results under different numbers of training iterations.
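A minimal sketch of the tabular Q-learning loop with the settings stated above (γ = 0.9, α = 0.01, zero-initialized Q table, and random exploration when a uniform draw exceeds 0.9) is given below. The state encoding uses the three capability levels; the node count, κ, and function names are assumptions for illustration.

```python
import random
from collections import defaultdict

GAMMA, ALPHA, EXPLORE_THRESHOLD = 0.9, 0.01, 0.9
M = 3            # number of edge nodes; the action is the chosen node index
KAPPA = 0.4      # weight between completion time and system cost

Q = defaultdict(float)   # Q[(state, action)] -> value, zeros by default

def capability_level(frames_per_second):
    """Discretize node capability: <5 fps -> CL=1, 5..10 fps -> CL=2, >10 -> CL=3."""
    if frames_per_second < 5:
        return 1
    return 2 if frames_per_second <= 10 else 3

def make_state(node_fps, task_size):
    """State s = (capability-level vector B_t, task size w_i)."""
    return (tuple(capability_level(f) for f in node_fps), task_size)

def choose_action(state):
    """Exploit the max-Q action; explore randomly when the draw exceeds 0.9."""
    if random.random() > EXPLORE_THRESHOLD:
        return random.randrange(M)
    return max(range(M), key=lambda a: Q[(state, a)])

def reward(t_wi, c_wi):
    """r_i = -(kappa * T_wi + (1 - kappa) * C_wi)."""
    return -(KAPPA * t_wi + (1 - KAPPA) * c_wi)

def update(state, action, r, next_state):
    """Standard Q-learning temporal-difference update."""
    best_next = max(Q[(next_state, a)] for a in range(M))
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
```

In use, each time slot would observe the nodes, build a state with make_state, pick a node with choose_action, then call update with the reward once the task's time and cost are known.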
Fig. 15. The completion time and the system cost vary along the weighting factor κ in the QL algorithm.

Fig. 17. Objective function value of the three algorithms for different numbers of tasks.
[29] X. Chen, H. Zhang, C. Wu, S. Mao, Y. Ji, and M. Bennis, "Performance optimization in mobile-edge computing via deep reinforcement learning," in 2018 IEEE 88th Vehicular Technology Conference (VTC-Fall). IEEE, 2018, pp. 1–6.
[30] C. You, K. Huang, H. Chae, and B.-H. Kim, "Energy-efficient resource allocation for mobile-edge computation offloading," IEEE Trans. on Wireless Communications, vol. 16, no. 3, 2017.
[31] J. Liu, Y. Mao, J. Zhang, and K. B. Letaief, "Delay-optimal computation task scheduling for mobile-edge computing systems," in 2016 IEEE International Symposium on Information Theory (ISIT). IEEE, 2016, pp. 1451–1455.
[32] B. Cao, L. Zhang, Y. Li, D. Feng, and W. Cao, "Intelligent offloading in multi-access edge computing: A state-of-the-art review and framework," IEEE Communications Magazine, vol. 57, no. 3, pp. 56–62, 2019.
[33] I. Ketykó, L. Kecskés, C. Nemes, and L. Farkas, "Multi-user computation offloading as multiple knapsack problem for 5G mobile edge computing," in EuCNC 2016. IEEE, 2016.
[34] A.-C. Pang, W.-H. Chung, T.-C. Chiu, and J. Zhang, "Latency-driven cooperative task computing in multi-user fog-radio access networks," in ICDCS 2017. IEEE, 2017.
[35] X. Ma, S. Zhang, W. Li, P. Zhang, C. Lin, and X. Shen, "Cost-efficient workload scheduling in cloud assisted mobile edge computing," in IWQoS 2017. IEEE, 2017.
[36] D. Sabella, V. Sukhomlinov, L. Trang, S. Kekki, P. Paglierani, R. Rossbach et al., "Developing software for multi-access edge computing," ETSI White Paper, vol. 20, 2019.
[37] S. Bi and Y. J. Zhang, "Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading," IEEE Transactions on Wireless Communications, vol. 17, no. 6, pp. 4177–4190, 2018.
[38] V. De Maio and I. Brandic, "First hop mobile offloading of DAG computations," in Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE Press, 2018, pp. 83–92.
[39] J. P. A. Neto, D. M. Pianto, and C. G. Ralha, "MULTS: A multi-cloud fault-tolerant architecture to manage transient servers in cloud computing," Journal of Systems Architecture, vol. 101, p. 101651, 2019.
[40] B. Cai and K. Li, "SLO-aware colocation: Harvesting transient resources from latency-critical services," Journal of Systems Architecture, vol. 101, p. 101663, 2019.
[41] J. Real, S. Sáez, and A. Crespo, "A hierarchical architecture for time- and event-triggered real-time systems," Journal of Systems Architecture, vol. 101, p. 101652, 2019.
[42] S. Di, D. Kondo, and W. Cirne, "Host load prediction in a Google compute cloud with a Bayesian model," in ICHPC, 2012.
[43] D. Yang, J. Cao, C. Yu, and J. Xiao, "A multi-step-ahead CPU load prediction approach in distributed system," in CGC 2012.
[44] M. Farahat and M. Talaat, "Short-term load forecasting using curve fitting prediction optimized by genetic algorithms," IJEE, vol. 2, no. 2, 2012.
[45] Q. Zhang, M. F. Zhani, S. Zhang, Q. Zhu, R. Boutaba, and J. L. Hellerstein, "Dynamic energy-aware capacity provisioning for cloud computing environments," in Proceedings of the 9th International Conference on Autonomic Computing. ACM, 2012, pp. 145–154.
[46] Docker. http://www.docker.com. Accessed: 2017-03-10.
[47] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, p. 529, 2015.
[48] T. Li, Z. Xu, J. Tang, and Y. Wang, "Model-free control for distributed stream data processing using deep reinforcement learning," Proceedings of the VLDB Endowment, vol. 11, no. 6, pp. 705–718, 2018.
[49] H. Mao, M. Alizadeh, I. Menache, and S. Kandula, "Resource management with deep reinforcement learning," in Proceedings of the 15th ACM Workshop on Hot Topics in Networks. ACM, 2016, pp. 50–56.
[50] H. Mao, R. Netravali, and M. Alizadeh, "Neural adaptive video streaming with Pensieve," in Proceedings of the Conference of the ACM Special Interest Group on Data Communication. ACM, 2017, pp. 197–210.

Xiaoheng Deng received the Ph.D. degree in computer science from Central South University, Changsha, Hunan, P.R. China, in 2005. Since 2006, he has been an Associate Professor and then a Full Professor with the Department of Electrical and Communication Engineering, Central South University. He is the Chair of the RS Changsha Chapter, a senior member of CCF, a member of the CCF Pervasive Computing Council, and a member of IEEE and ACM. He was a chair of CCF YOCSEF CHANGSHA from 2009 to 2010. His research interests include wireless communications and networking, congestion control for wired/wireless networks, cross-layer route design for wireless mesh and ad hoc networks, online social network analysis, and edge computing.

Jun Li is an M.Sc. student in the School of Computer Science and Engineering of Central South University, Changsha, China. She received the B.Sc. degree in electronic information engineering from Hunan Normal University, Changsha, China, in 2018. Her major research interests are wireless networks and edge computing.

Enlu Liu received the M.Sc. degree in information and communication engineering from Central South University, Changsha, China, in 2019. Her major research interests are wireless networks and edge computing.

Honggang Zhang holds a Ph.D. in Computer Science (2006) from the University of Massachusetts, Amherst, USA. He received his B.S. degree from Central South University of China, and his M.S. degree from Tianjin University of China. He also received an M.S. degree from Purdue University, West Lafayette, IN, USA. He is currently an Associate Professor of Computer Engineering in the Engineering Department at the University of Massachusetts Boston, Boston, MA, USA. His research interests span a wide range of topics in the area of computer networks and distributed systems. His current research focuses primarily on edge computing, mobile computing, and the Internet of Things. He was a recipient of the National Science Foundation (NSF) CAREER Award in 2009.