
Computer Communications 184 (2022) 107–117


Latency and quality-aware task offloading in multi-node next generation RANs✩,✩✩

Ayman Younis∗, Brian Qiu, Dario Pompili

Electrical and Computer Engineering, Rutgers University–New Brunswick, NJ, USA

ARTICLE INFO

Keywords:
NG-RAN
Task offloading at the edge
Convex optimization
OpenAirInterface (OAI)
Testbed

ABSTRACT

Next-Generation Radio Access Network (NG-RAN) is an emerging paradigm that provides flexible distribution of cloud computing and radio capabilities at the edge of the wireless Radio Access Points (RAPs). Computation offloading at the edge bridges the gap for roaming end users, enabling access to rich services and applications. In this paper, we propose a multi-edge node task offloading system, QLRan, a novel optimization solution for latency- and quality-aware task allocation in NG-RANs. Considering constraints on service latency, quality loss, edge capacity, and task assignment, the problem of joint task offloading, latency, and Quality Loss of Result (QLR) is formulated to minimize the User Equipments' (UEs') task offloading utility, which is measured by a weighted sum of reductions in task completion time and QLR cost. The QLRan optimization problem is proved to be a Mixed Integer Nonlinear Program (MINLP), which is NP-hard. To solve it efficiently, we utilize a Linear Programming (LP)-based approach that can then be handled with convex optimization techniques. Additionally, a programmable NG-RAN testbed is presented in which the Central Unit (CU), Distributed Unit (DU), and UE are realized by USRP boards and fully container-based virtualization; specifically, we use the OpenAirInterface (OAI) and Docker software platforms to deploy and run the NG-RAN testbed under different functional split options. We then characterize the performance in terms of data input, memory usage, and average processing time with respect to QLR levels. Simulation results show that our algorithm significantly improves the network latency over different configurations.

1. Introduction

Motivation: Mobile platforms (e.g., smartphones, tablets, IoT mobile devices) are becoming the predominant medium of access to Internet services due to a tremendous increase in their computation and communication capabilities. However, enabling applications that require real-time, in-the-field data collection and mobile platform processing is still challenging due to (i) the insufficient computing capabilities and unavailable aggregated/global data on individual mobile devices and (ii) the prohibitive communication cost and response time involved in offloading data to remote computing resources such as cloud datacenters for centralized computation. In light of these limitations, the term edge computing was introduced to unite telco, IT, and cloud computing and provide cloud services directly from the network edge. In general, the edge cloud servers or nodes are usually deployed directly at the mobile Base Stations (BSs) of a Radio Access Network (RAN), or at the local wireless Access Points (APs), using a generic computing platform. Hence, the edge cloud node has the ability to execute the offloaded applications in close proximity to end users. In this way, the network end-to-end (e2e) latency and back-/mid-/front-haul costs are reduced. Recently, the Cloud Radio Access Network (C-RAN) [2] has emerged as a clean-slate redesign of the mobile network architecture in which parts of the physical-layer communication functionalities are decoupled from distributed, possibly heterogeneous, Radio Access Points (RAPs), i.e., BSs or WiFi hotspots, and are then consolidated into a baseband unit pool for centralized processing. However, the centralized C-RAN design follows a "one size fits all" architectural approach, which makes it difficult to address the wide range of Quality of Service (QoS) requirements and to support different types of traffic [3]. Also, a fully centralized architecture imposes high capacity requirements on fronthaul links [4]. Therefore, the Next Generation RAN (NG-RAN) [5] has been introduced as a resource-efficient solution to address the above issues and reduce deployment costs. It is worthy of note that, due to the flexibility of the NG-RAN architecture,
✩ A preliminary/shorter version of this work appeared in the Proc. of the IEEE/IFIP Wireless On-demand Network Systems and Services Conference (WONS),
Mar’21 [1].
✩✩ This work was supported by the US NSF Grant No. ECCS-2030101.
∗ Corresponding author.
E-mail addresses: [email protected] (A. Younis), [email protected] (B. Qiu), [email protected] (D. Pompili).

https://doi.org/10.1016/j.comcom.2021.11.026
Received 30 July 2021; Received in revised form 27 October 2021; Accepted 29 November 2021
Available online 24 December 2021
0140-3664/© 2021 Elsevier B.V. All rights reserved.

mobile network operators will have a high degree of freedom to move from a "full centralization" in C-RAN, to a "partial centralization" in NG-RAN with a specific functional splitting option, to a "distributed approach" in the edge cloud [6], enabling rich services and applications in close proximity to the end users.

Task offloading can enhance the performance of mobile devices because servers in the edge cloud have higher computation capabilities than mobile devices. Therefore, enabling task offloading in NG-RAN is proposed to address the limitations (e.g., storage and computing resources) of the existing RANs. Meanwhile, in some cases, processing the entire input data in edge cloud servers would require more than the available computing resources to meet the desired latency/throughput guarantees. In the context of NG-RAN applications (e.g., IoT, AR/VR), transferring, managing, and analyzing large amounts of data in an edge cloud would be prohibitively expensive. Hence, trading off service latency against the tolerance for quality loss can improve key network performance metrics like the user's QoS [7,8]. In this paper, we define the Quality Loss of Result (QLR) term as the level of relaxation/approximation in data processing while the user's QoS remains at an acceptable level. Accordingly, our key idea is motivated by the observation that in several NG-RAN applications such as media processing, image processing, and data mining, a high-accuracy result is not always necessary or desirable; instead, obtaining a suboptimal result at a low latency cost is more acceptable to vendors or end users. Consequently, relaxing QLR in such applications alleviates the required computation workload and enables a significant reduction of latency and computing cost in NG-RAN.

Our Vision: Our objective is to design a holistic decision-maker for an optimal joint task offloading scheme with quality and latency awareness in a multi-edge NG-RAN to minimize the UEs' overall offloading cost. Specifically, we consider a multi-edge node network where each RAP is equipped with an edge node to provide computation offloading services to UEs. In this way, several key benefits are brought to the NG-RAN system by the multi-node servers: (i) preventing the resource-limited edge nodes/servers from becoming the bottleneck; usually, cloud servers overload when serving a large number of UEs with high processing priority, and by directing many UEs to nearby edge nodes the overload can be alleviated; (ii) reducing the energy consumption and network latency, since each UE has the capability to offload its task to the RAP with a more favorable uplink channel condition; (iii) achieving better network collaboration, since the RAPs in a multi-RAP NG-RAN can coordinate with each other to manage and balance the computation resources between the edge servers. In this work, a latency and quality tradeoff task offloading problem, QLRan, is formulated to trade off the service latency against the acceptable level of QLR under specific application requirements (e.g., QoS, computing, and transmission demands). Additionally, the process of task allocation across edge nodes is formulated as an objective optimization problem. The optimization objectives include both minimizing the average service latency and reducing the overall quality loss.

Our Contributions: The main objective of this paper is to design the QLRan algorithm, optimizing the trade-off between the application completion time and QLR cost. The main contributions of this paper are summarized as follows.

• Subject to transmission and processing delays, quality loss, and computing capacity constraints, we formulate and mathematically analyze the QLRan optimization problem in NG-RAN as a Mixed Integer Nonlinear Program (MINLP) that jointly optimizes the computational task allocation and QLR levels. The problem formulation and analysis trade off optimizing the service latency against the overall quality loss.

• The QLRan optimization problem is proved to be a non-deterministic polynomial-time hard (NP-hard) problem. To solve the problem efficiently, we first relax the binary computation offloading decision variable and the QLR level to real numbers. Then, we utilize a Linear Programming (LP)-based method to solve the relaxed QLRan problem using convex optimization techniques.

• We provide a set of tools to deploy the NG-RAN mobile network. To explore virtualization in the 5G system, we assign several OpenAirInterface (OAI) [9] containers composing a RAN and the core of the 5G system. Specifically, we implement a programmable testbed to demonstrate a connection between the UE, RAN, and Evolved Packet Core (EPC) implemented in the NG-RAN virtualization environment. Real-time experiments are carried out under various configurations in order to profile functional splitting, data input, memory usage, and average processing time with respect to QLR levels.

• We provide formal proofs on the convergence and optimality of our algorithm and evaluate its performance under different network conditions. In terms of computing capacity and number of tasks, the numerical results show that latency can be reduced while decreasing the QLR level under practical physical constraints.

Paper Organization: The remainder of this article is organized as follows. The related work is introduced in Section 2. Section 3 includes a system overview in terms of functional split options and the task offloading process. We present the system model in Section 4. The QLRan problem is formulated in Section 5, followed by a linear programming-based solution for the QLRan optimization problem. The performance evaluation is discussed in Section 6; finally, we conclude the paper in Section 7.

2. Related work

In this section, we introduce the key concepts and papers from both industry and academia over the past several years.

2.1. Related concepts and technologies

Several cloud-based task offloading frameworks have been proposed in recent years. For example, Mobile Cloud Computing (MCC) has been proposed as a cloud-based network that can provide mobile devices with significant capabilities such as storage, computation, and task offloading to a centralized cloud [10]. However, MCC has faced several noticeable challenges in addressing next-generation mobile requirements in terms of end-to-end network latency, coverage, and security. To tackle these challenges, Multi-access Edge Computing (MEC) has been introduced by the European Telecommunications Standards Institute (ETSI) as an integration of edge cloud computing systems and wireless mobile networks [11]. One of the key features of MEC is to enable rich services and applications in close proximity to end users. With the MEC paradigm, mobile devices have options to offload their computation-intensive tasks to a MEC server to meet the demanding Key Performance Indicators (KPIs) of 5G and beyond, especially in terms of low latency and energy efficiency. Similar to MEC systems, fog computing networks were proposed by Cisco Systems to bring cloud services to the edge of an enterprise network [12]. In fog networks, the computation processing is mainly executed in local area networks and in IoT gateways or fog nodes. Recently, the concept of NG-RAN has been defined by 3GPP as a promising approach to merge edge cloud features and RAN functionalities. In industry, many RAN organizations have made significant progress in implementing open-source software that supports NG-RAN technology. For instance, EURECOM has implemented the OpenAirInterface (OAI) platform [9], which provides an open, full software implementation of 5G and beyond systems compliant with 3GPP standards under real-time algorithms and protocols. Plus, ORAN [13], founded by AT&T, aims to drive the mobile industry towards an ecosystem of innovative, multi-vendor, interoperable, and autonomous NG-RANs with reduced cost, improved performance, and greater agility. In general, these open RAN-software projects have a high degree of flexibility, such as being able to run CU and DU entities over a fully virtual environment such as VMs or Linux containers, as


Fig. 1. Logical diagram for uplink/downlink of gNB with eight functional split options.

well as enabling promising next-generation features (e.g., network slicing and functional splitting). Such NG-RAN software will undoubtedly speed up the transition from monolithic and inflexible networks to agile, distributed elements depending on virtualization, softwarization, openness, and intelligence, i.e., fully interoperable RAN components.

2.2. Task offloading in cloud-based RANs

As part of task offloading in cloud-based RAN, several papers have focused on enhancing overall system performance in network energy, system latency, and energy efficiency. For instance, the work in [14] formulates a joint task offloading and resource allocation problem to maximize the users' task offloading gains in MEC. The main optimization problem is then decomposed into several sub-problems that are solved using convex and quasi-convex optimization techniques. The authors in [15] study the energy-latency tradeoff problem for IoT partial task offloading in the MEC network by jointly optimizing the local computing frequency, task splitting, and transmit power; the optimization is then solved by an alternate convex search-based algorithm. In [16], by considering a cloud-fog computing network, the authors design a computation offloading algorithm to minimize the total cost with respect to energy consumption and offloading latency. To maximize the energy efficiency of task offloading, Vu et al. propose an approach based on the interior point method and bound algorithm. Exploiting machine learning methods in task offloading has also attracted several lines of research in cloud-based RAN systems. Using reinforcement learning, the work in [17] introduces a MEC-based blockchain network where multiple mobile users act as miners to offload their data processing and mining tasks to a nearby MEC server via wireless channels. Although the focus of our article is in line with the works mentioned above, applying different offloading schemes and constraints within the joint optimization NG-RAN framework could open up new, interdisciplinary avenues for researchers in the context of 5G and beyond systems. The previously mentioned works consider a single remote server as the offloading destination. In contrast, considering constraints on service latency, quality loss, and edge capacity, our paper proposes an algorithmic approach for latency and quality tradeoff task offloading in multi-node NG-RANs. Furthermore, our work is based on real-world NG-RAN testbed experiments that allow us to characterize the performance in terms of data input, memory usage, and average processing time with respect to QLR levels.

3. System overview

We describe here the functional split options and introduce the task allocation procedure for NG-RAN. Table 1 summarizes the key notations used.

Table 1
Summary of key notations.

Symbol     Description
𝒰          Set of UEs
𝒮          Set of edge nodes
𝒦          Set of computational tasks
𝑎𝑢𝑘        Indicator of whether task 𝑘 is generated by UE 𝑢
𝑎𝑢𝑠        Indicator of whether edge node 𝑠 is available for UE 𝑢
𝑎𝑠𝑘        Indicator of whether task 𝑘 is assigned to edge node 𝑠
𝑞𝑘         QLR level assigned to task 𝑘
𝐷𝑢(𝑞𝑘)     Input data to transfer computing task 𝑘 from UE 𝑢 to the edge
𝐶𝑢(𝑞𝑘)     Computation workload to accomplish task 𝑘
𝑅𝑢𝑠        Transmission data rate of the link between edge node 𝑠 and UE 𝑢
𝜏𝑘^up      Uplink transmission time
𝜏𝑘^exe     Execution time of task 𝑘 at the edge
𝑓𝑢𝑠        CPU-clock frequency assigned on edge node 𝑠 to UE 𝑢
𝐵(𝑞𝑘)      Computation demanded by task 𝑘 at its QLR level
𝛿^𝜏        Weight of latency consumption time for task 𝑘
𝛿^𝑞        Weight of QLR level for task 𝑘

3.1. NG-RAN functional split options

As part of the NG-RAN study, 3GPP proposed several functional splits between CUs and DUs; accordingly, eight possible options have been proposed, as shown in Fig. 1 [18]. The choice of how to split the NG-RAN architecture depends on several factors related to radio network status, traffic size, and network providers' services, such as low latency, high throughput, UE density, and the geographical location of DUs. By moving from Option 1 to Option 8, a tradeoff can be established between fronthaul latency and processing complexity. Basically, by adding more baseband functions at the DUs, the required fronthaul rate can be reduced, while the processing complexity and the energy consumption at the DUs will increase [19]. Specifically, computationally costly operations like the Fast Fourier Transform (FFT), Inverse Fast Fourier Transform (IFFT), rate matching, and Turbo encoding/decoding are shifted to the CU side, resulting in variation in energy consumption at the CU and the DU. It is worth noting that, in an actual NG-RAN testbed implementation, both the CU and DU can be implemented through virtualization; for example, in the OAI platform, each CU can be initialized by one container image and linked to a DU container image. We will present the functional split-based testbed design in detail to provide real-time NG-RAN experiments.

3.2. Task allocation process

The main process of the task allocation in our proposed NG-RAN system can be summarized as follows:


1. Edge cloud nodes: Initially, a UE searches its communication area for the best edge cloud node to connect to. Hence, the UE sends a pilot signal and collects responses from edge cloud nodes. Any edge cloud node that responds is considered a potential candidate. For instance, in Fig. 2, the edge cloud candidate of the UE is the edge node within the coverage area of the LTE eNB DU.

2. Task classification: After the edge cloud node assignment, the UE starts uploading the task information to the edge node. Key information includes: (i) the unique ID of the uploading task; (ii) the application's layers and requirements; (iii) the task profile, which includes the task constraints (e.g., tolerable latency, QLR level, workload).

3. Task executing: After task classification, the RAP runs a resource allocation algorithm to determine: (i) the service time required for task accomplishment; (ii) the computing capacity that is available for executing the task; and (iii) a comparison of these estimates against the task's tolerable latency requirement.

4. System model

In this section, we describe the network setting, the quality loss of result tradeoff, and the task uploading model.

4.1. Network description

For the NG-RAN system model, we consider a multi-cell, multi-node edge system as illustrated in Fig. 2, in which each RAP (e.g., BS, eNodeB (eNB), gNodeB (gNB), etc.) engages with a set 𝒮 = {1, 2, …, 𝑆} of 𝑆 edge nodes (e.g., neighboring DU servers) to supply computation offloading services to limited-resource mobile devices such as smartphones, tablets, and IoT devices. Specifically, each edge cloud node can be realized either by a physical server or by a Virtual Machine (VM)/container, which can communicate with the UE through wireless channels provided by the corresponding RAP. Plus, each UE can select to offload its computation task to an edge node from the nearby candidate servers. Accordingly, we denote the set of UEs in the mobile system and the set of computation tasks as 𝒰 = {1, 2, …, 𝑈} and 𝒦 = {1, 2, …, 𝐾}, respectively. To define the association between UEs and RAPs, we define two binary indicators as follows: 𝑎𝑢𝑘 ∈ {0, 1} indicates whether task 𝑘 is generated by UE 𝑢, while 𝑎𝑢𝑠 ∈ {0, 1} indicates whether edge node 𝑠 is available for UE 𝑢 (i.e., the edge node 𝑠 has an acceptable channel state condition to be in the list of edge candidates). Hence,

𝑎𝑢𝑘 = { 1 if 𝑘 ∈ 𝒦𝑢; 0 otherwise },   𝑎𝑢𝑠 = { 1 if 𝑠 ∈ 𝒮𝑢; 0 otherwise },   (1)

where 𝒮𝑢 ⊆ 𝒮 denotes the set of edge candidates for UE 𝑢, and 𝒦𝑢 ⊆ 𝒦 is defined as the set of tasks generated by UE 𝑢. Thus, from (1), we can denote by 𝑎𝑠𝑘 a binary variable indicating whether task 𝑘 is assigned to edge node 𝑠 or not. If the edge node 𝑠 is available for UE 𝑢, the task 𝑘 can be successfully assigned to the edge node 𝑠. Hence, 𝑎𝑠𝑘 will satisfy the following requirement,

𝑎𝑠𝑘 ≤ min{𝑎𝑢𝑘, 𝑎𝑢𝑠}, ∀𝑢 ∈ 𝒰, 𝑘 ∈ 𝒦, 𝑠 ∈ 𝒮.   (2)

The modeling of user computation tasks, task uploading transmissions, edge computation resources, and offloading utility is presented below.

4.2. Quality loss of result tradeoff

Many emerging applications in cloud-based computing networks (e.g., online recommenders, video streaming, object recognition, and image processing) exhibit various optional parameters that allow end-users to take advantage of the tradeoff between QLR and service latency. For instance, many object recognition algorithms demand specific extraction methods for a number of layers with given wavelengths and orientations from image datasets for advanced analysis [20]. Hence, the QLR governing the processing time in object recognition can be relaxed if the number of extracted layers is properly adapted. Another example is multi-bitrate video streaming, in which Over-The-Top (OTT) video content providers (e.g., YouTube, Amazon Prime, Netflix, . . . ) offer end-users different video quality levels to fit the device's display and network connection [6,21]. Adjusting the video quality levels can save extra computational energy and time for OTT video providers while at the same time letting users enjoy video watching without interruption. In this paper, we denote by 𝑞𝑘 the QLR level assigned to task 𝑘. Hence, we allow each UE 𝑢 to select different 𝑞𝑘 values to exploit the trade-off between processing cost and latency. We define QLR in five levels, in which level 1 refers to the strictest demand for quality, while level 5 represents the highest tolerance for quality loss. In practice, QLR levels are determined at an application-specific level.

4.3. Task uploading

The computation task uploading in the NG-RAN system can be described as a tuple of two parameters, ⟨𝐷𝑢(𝑞𝑘), 𝐶𝑢(𝑞𝑘)⟩, where 𝐷𝑢(𝑞𝑘) [bits] represents the amount of input data required to transfer the application execution (including system settings, application codes, and input parameters) from the local device to the edge node, and 𝐶𝑢(𝑞𝑘) [cycles] denotes the workload, i.e., the amount of computation to accomplish the task. Each UE 𝑢 ∈ 𝒰 has one computation task at a time that is atomic and cannot be divided into subtasks. The values of 𝐷𝑢(𝑞𝑘) and 𝐶𝑢(𝑞𝑘) can be obtained through careful profiling of the task execution [14].

In Section 6, we will provide more details about the modeling of these metrics. Besides, the computation task associated with a UE can be executed locally or offloaded to an edge cloud node. The mobile device would save battery power by offloading part of its task application to the remote edge; however, a non-negligible cost, in time and energy, from uploading the input data would be added in the task offloading scenario. Therefore, similar to [14], we consider several time parameters. When UE 𝑢 offloads its task 𝑘 to one of the edge nodes, the overall uploading time delay consists of the following: (i) the time 𝜏𝑘^up [s] to transmit the input to the edge node on the uplink, (ii) the time 𝜏𝑘^exe to perform the computation task at the edge node, and (iii) the time to bring the output data from the edge node back to the UE on the downlink. In general, the size of the output data is much smaller than that of the input data, and the downlink data rate is much higher than that of the uplink. Therefore, similar to [14,22,23], we neglect the delay of sending the output in our computation model. Note that when the delay of the downlink transmission of output data is non-negligible, our proposed approach can still be directly applied for a given downlink rate allocation scheme and known output data size. The transmission time of UE 𝑢, required to send its task data input 𝐷𝑢(𝑞𝑘) on the uplink, can be determined as,

𝜏𝑘^up = 𝐷𝑢(𝑞𝑘)/𝑅𝑢𝑠, ∀𝑢 ∈ 𝒰, 𝑘 ∈ 𝒦, 𝑠 ∈ 𝒮,   (3)

where 𝑅𝑢𝑠 is the transmission data rate of the link between the selected edge node 𝑠 and UE 𝑢. Given the computing resource assignment, the execution time of task 𝑘 at edge node 𝑠 is,

𝜏𝑘^exe = 𝐶𝑢(𝑞𝑘)/𝑓𝑢𝑠, ∀𝑢 ∈ 𝒰, 𝑘 ∈ 𝒦, 𝑠 ∈ 𝒮,   (4)

where 𝑓𝑢𝑠 denotes the CPU-clock frequency assigned on edge node 𝑠 to UE 𝑢 for task 𝑘.
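To make the delay model of Eqs. (3) and (4) concrete, the short Python sketch below evaluates both terms for a single offloaded task. It is only an illustration of the formulas: the function names and the numeric inputs (a 2 MB input, a 50 Mbit/s uplink, a 10^9-cycle workload, and a 2 GHz CPU share) are hypothetical, not values measured on the QLRan testbed.

```python
# Illustrative sketch of the task-uploading delay model of Eqs. (3)-(4).
# All numbers are hypothetical placeholders, not QLRan testbed measurements.

def uplink_time(d_bits: float, r_bps: float) -> float:
    """Eq. (3): tau_up = D_u(q_k) / R_us."""
    return d_bits / r_bps

def execution_time(c_cycles: float, f_hz: float) -> float:
    """Eq. (4): tau_exe = C_u(q_k) / f_us."""
    return c_cycles / f_hz

def offloading_delay(d_bits: float, r_bps: float,
                     c_cycles: float, f_hz: float) -> float:
    # The downlink delay for returning the (much smaller) output is
    # neglected, as discussed in Section 4.3.
    return uplink_time(d_bits, r_bps) + execution_time(c_cycles, f_hz)

# Example: 2 MB of input data over a 50 Mbit/s uplink, plus a 1e9-cycle
# workload on a 2 GHz CPU share at the edge node.
tau = offloading_delay(d_bits=2e6 * 8, r_bps=50e6, c_cycles=1e9, f_hz=2e9)
print(f"tau_up + tau_exe = {tau:.3f} s")  # 0.32 s uplink + 0.50 s execution
```

Since raising the QLR level 𝑞𝑘 shrinks both 𝐷𝑢(𝑞𝑘) and 𝐶𝑢(𝑞𝑘), both delay terms decrease together, which is the lever the QLRan formulation exploits.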


Fig. 2. System overview of QLRan, in which the gray circle represents the communication range of the RAP.

4.4. System constraints

We now introduce the following four constraints to capture the features of a task offloading multi-node NG-RAN system.

1. QLR constraint: As described in Section 4.2, 𝑞𝑘 can be modeled based on a specific key metric of an application. In our scenario, we adopt the video resolution level as 𝑞𝑘 in the video streaming application. Under these considerations, which will be described in more detail in Section 6, the QLR constraint for task 𝑘 is defined as 𝑞𝑘 ∈ {1, 2, 3, 4, 5}, ∀𝑘 ∈ 𝒦.

2. Task association constraint: We assume each computation task of the UE must be assigned to exactly one edge cloud node. Hence, the offloading policy must satisfy the task association constraint, expressed as,

∑_{𝑠∈𝒮} 𝑎𝑠𝑘 = 1, ∀𝑘 ∈ 𝒦.   (5)

3. Service latency constraint: In many graphic applications with multiple tasks, the reduction of the computation workload at the edge node considerably affects the task execution latency. For instance, real-time gaming applications prefer a response time of around 50 ms to enjoy a higher Quality of Experience (QoE) [24]. Achieving appropriate latency for a graphic video application demands tradeoffs among processing time, uploading time, and quality. In this paper, we denote by 𝜏𝑘^max the maximum tolerable system latency for task 𝑘. To guarantee that the task is accomplished within the allowed threshold time, the service latency constraint is expressed as,

𝜏𝑘^up + 𝜏𝑘^exe ≤ 𝜏𝑘^max, ∀𝑘 ∈ 𝒦.   (6)

4. Resource constraint: In a multi-node NG-RAN with intensive workloads, the computation capacity should be taken into account while designing a latency-quality optimization algorithm. The computation capacity could refer to several hardware metrics such as GPU, CPU, and memory. Adjusting these parameters directly affects the service latency and the required quality. However, the computational processing at an edge cloud node cannot exceed its limited capacity. Therefore, we introduce the parameter 𝐵𝑠^max as the maximum computation capacity of edge node 𝑠, while 𝐵(𝑞𝑘) is defined as the computation capacity required for processing task 𝑘 at QLR level 𝑞𝑘. Hence, the capacity constraint is modeled as,

∑_{𝑘∈𝒦} 𝐵(𝑞𝑘) 𝑎𝑠𝑘 ≤ 𝐵𝑠^max, ∀𝑠 ∈ 𝒮.   (7)

5. Problem formulation

In this section, we mathematically formulate the QLRan optimization problem, which optimizes the trade-off between the service latency and the quality loss while offloading tasks to NG-RAN edge nodes. Due to the intractability of the problem and the need for a practical solution, we then present a step-by-step linear programming-based solution, which is employed to transform the QLRan problem into a convex optimization problem.

5.1. Latency and quality tradeoffs problem

For a given 𝒜 = {𝑎𝑠𝑘 | 𝑠 ∈ 𝒮, 𝑘 ∈ 𝒦}, the set of selected edge nodes, and 𝒬 = {𝑞𝑘 | 𝑘 ∈ 𝒦}, the set of selected QLR levels, we define the system utility as the weighted sum of all the UEs' offloading utilities,

𝐽𝑘(𝒜, 𝒬) = ∑_{𝑠∈𝒮} (𝛿^𝜏 𝜏𝑘 + 𝛿^𝑞 𝑞𝑘) 𝑎𝑠𝑘, ∀𝑘 ∈ 𝒦,   (8)

where 𝜏𝑘 = 𝜏𝑘^up + 𝜏𝑘^exe, and 0 ≤ 𝛿^𝜏 ≤ 1 and 0 ≤ 𝛿^𝑞 ≤ 1 denote the weights of the latency consumption time and the QLR level for task 𝑘, respectively. Note that we define the latency and quality tradeoff utility 𝐽𝑘(𝒜, 𝒬) of task 𝑘 as a linear combination of the two metrics because both can concurrently reflect the latency-quality tradeoff of executing a task, i.e., both a longer computation completion time and a higher accuracy of the result lead to a higher computational cost. To meet task-specific demands, we allow different UEs to select different weights, denoted by 𝛿^𝜏 and 𝛿^𝑞, in decision making. For example, a UE with a low-accuracy application demand would choose a larger 𝛿^𝑞 to save more computational cost. On the other hand, when a UE is running delay-sensitive applications (e.g., online movies), it may prefer to set a larger 𝛿^𝜏 to reduce the latency. We now formulate the Latency and Quality Tradeoffs (QLRan) problem as a system utility minimization problem, i.e.,

𝒫1: min_{𝒜,𝒬} ∑_{𝑘∈𝒦} 𝐽𝑘(𝒜, 𝒬)   (9a)
s.t.:

𝑎𝑠𝑘 ∈ {0, 1}, 𝑞𝑘 ∈ {1, 2, 3, 4, 5}, ∀𝑠 ∈ 𝒮, 𝑘 ∈ 𝒦,   (9b)

∑_{𝑠∈𝒮} (𝜏𝑘up + 𝜏𝑘exe) 𝑎𝑠𝑘 ≤ 𝜏𝑘max, ∀𝑘 ∈ 𝒦,   (9c)

∑_{𝑘∈𝒦} 𝐵(𝑞𝑘) 𝑎𝑠𝑘 ≤ 𝐵𝑠max, ∀𝑠 ∈ 𝒮,   (9d)

∑_{𝑠∈𝒮} 𝑎𝑠𝑘 = 1, ∀𝑘 ∈ 𝒦.   (9e)

The constraints in the formulation above can be explained as follows: constraint (9b) restricts the offloading decision to be binary and the QLR level to five discrete values; constraint (9c) ensures that the computation task is accomplished within a time that cannot exceed the demanded maximum threshold, 𝜏𝑘max; constraint (9d) implies that the demand for computation capacity must not exceed the edge node capacity; finally, constraint (9e) indicates that each task must be assigned as a whole to one edge node.

Proposition 1. 𝒫1 is an NP-hard problem.

Proof. To demonstrate that 𝒫1 is NP-hard, let us first consider the case where 𝛿𝜏 = 0 and 𝛿𝑞 = 1. This means that the time spent for uploading and executing a task is neglected, and the focus is only on the second part of 𝐽𝑘(𝒜, 𝒬), where the QLR term is important. We assume that 𝑞̂𝑘 represents the inverted value of 𝑞𝑘 and denotes the quality level in the result of task 𝑘. Accordingly, we can reformulate 𝒫1 as 𝒫̂1, in which the new objective function 𝐽̂(𝒜, 𝒬) is maximized. Moreover, constraint (9c) can be omitted for simplicity. Besides, constraint (9d) is rewritten to imply that the resource requirement of task 𝑘 is exactly equal to its quality value 𝑞̂𝑘. Each edge cloud node in the NG-RAN system can only handle one task generated from a UE in the RAP coverage area. Let 𝑎̂𝑘 be defined as a binary indicator showing whether task 𝑘 is assigned to the edge node, and let 𝐵 denote the resource capacity of the edge node. With these considerations, the optimization problem in (9) can be relaxed as,

𝒫̂1: max ∑_{𝑘∈𝒦} 𝑞̂𝑘 𝑎̂𝑘   (10a)

s.t.: ∑_{𝑘∈𝒦} 𝑞̂𝑘 𝑎̂𝑘 ≤ 𝐵,   (10b)

𝑎̂𝑘 ∈ {0, 1}.   (10c)

It is obvious that problem 𝒫̂1 is a standard subset-sum problem, which is NP-complete [25]. Therefore, 𝒫1 can also be characterized as an NP-hard problem. The proof is complete. ■

Next, we propose an iterative approach to solve 𝒫1 based on Linear Programming (LP) optimization. By utilizing a standard optimization solver (e.g., MOSEK [26]), the proposed system can generate an efficient task allocation decision under an acceptable latency tolerance constraint.

5.2. Linear programming-based solution

The key challenge in solving the optimization problem 𝒫1 is that the integer constraints 𝑎𝑠𝑘 ∈ {0, 1} and 𝑞𝑘 ∈ {1, …, 5} make 𝒫1 a MIP problem, which is in general non-convex and NP-complete [27]. Thus, similar to the works in [8,28], we first relax the binary computation offloading decision variable, 𝑎𝑠𝑘, and the QLR level, 𝑞𝑘, to real numbers, i.e., 0 ≤ 𝑎𝑠𝑘 ≤ 1. Then we discuss the convexity of 𝒫1 with the relaxed optimization variables 𝑎𝑠𝑘 and 𝑞𝑘. We consider the following: 𝐷(𝑞𝑘) = 𝑦𝑑𝑞𝑘 + 𝑧𝑑, 𝐶(𝑞𝑘) = 𝑦𝑡𝑞𝑘 + 𝑧𝑡, 𝐵(𝑞𝑘) = 𝑦𝑏𝑞𝑘 + 𝑧𝑏, and 𝑥𝑠𝑘 = 𝑞𝑘𝑎𝑠𝑘. The parameters 𝑦𝑑, 𝑧𝑑, 𝑦𝑡, 𝑧𝑡, 𝑦𝑏, and 𝑧𝑏 can be estimated by offline profiling of the NG-RAN testbed, as detailed in Section 6. The LP problem for the primal problem is given by,

𝒫2: min_{𝒜,𝒬,𝒳,𝑡} 𝛿𝜏 𝑡 + 𝛿𝑞 ∑_{𝑠∈𝒮} 𝑥𝑠𝑘   (11a)

s.t.:

0 ≤ 𝑎𝑠𝑘 ≤ 1, 1 ≤ 𝑞𝑘 ≤ 5, 𝑡 ≤ 𝜏𝑘max, ∀𝑠 ∈ 𝒮, 𝑘 ∈ 𝒦,   (11b)

0 ≤ 𝑥𝑠𝑘 ≤ 5𝑎𝑠𝑘, ∀𝑠 ∈ 𝒮, 𝑘 ∈ 𝒦,   (11c)

𝑞𝑘 − 5(1 − 𝑎𝑠𝑘) ≤ 𝑥𝑠𝑘 ≤ 𝑞𝑘, ∀𝑠 ∈ 𝒮, 𝑘 ∈ 𝒦,   (11d)

∑_{𝑠∈𝒮} [(𝑦𝑑/𝑅𝑢𝑠 + 𝑦𝑡/𝑓𝑢𝑠) 𝑥𝑠𝑘 + (𝑧𝑑/𝑅𝑢𝑠 + 𝑧𝑡/𝑓𝑢𝑠) 𝑎𝑠𝑘] ≤ 𝜏𝑘max, ∀𝑘 ∈ 𝒦,   (11e)

∑_{𝑠∈𝒮} 𝑎𝑠𝑘 = 1, ∀𝑘 ∈ 𝒦.   (11f)

Proposition 2. Constraints (11c) and (11d) can be relaxed to the constraint 𝑥𝑠𝑘 = 𝑎𝑠𝑘𝑞𝑘.

Proof. Case 1: (𝑎𝑠𝑘 = 0 and 𝑞𝑘 ∈ [1, 5]). From constraints (11c) and (11d), we can conclude the following,

𝑥𝑠𝑘 ≤ 0, 𝑥𝑠𝑘 ≥ 0, and 𝑥𝑠𝑘 ≤ 𝑞𝑘, 𝑥𝑠𝑘 ≥ 𝑞𝑘 − 5.   (12)

After solving (12), we get 𝑥𝑠𝑘 = 0.

Case 2: (𝑎𝑠𝑘 = 1 and 𝑞𝑘 ∈ [1, 5]). From constraints (11c) and (11d),

𝑥𝑠𝑘 ≤ 5, 𝑥𝑠𝑘 ≥ 0, and 𝑥𝑠𝑘 ≤ 𝑞𝑘, 𝑥𝑠𝑘 ≥ 𝑞𝑘.   (13)

From (13), we can conclude 𝑥𝑠𝑘 = 𝑞𝑘. From Case 1 and Case 2, we demonstrate that constraints (11c) and (11d) are equivalent to the constraint 𝑥𝑠𝑘 = 𝑎𝑠𝑘𝑞𝑘. The proof is complete. ■

6. Performance evaluation

In this section, we describe the testbed experiments and simulation results to provide more details about the QLR level model in terms of memory and CPU usage, as well as to test the effectiveness of the QLRan algorithm.

6.1. Testbed experiment

We present here our QLRan testbed, including the architecture, configuration, and experiment methods. Then, we analyze the performance of QLRan in terms of CPU processing time and latency.

6.1.1. Architecture

We conducted experiments on a testbed consisting of various components, i.e.,

• End users: For our experiment we use a Samsung Galaxy S9 running Android 10 that acts as the UE.
• Edge nodes: To simulate the edge node, we use an Asus laptop equipped with an Intel Pentium III processor running Ubuntu 18.04. The cloud is represented by a more powerful desktop PC with an Intel Xeon E5-1650, 12-core at 3.5 GHz and 32 GB RAM.
• Network: The structure of OAI consists of two components: one, called oai, is used for building and running gNB units; the other, called openair-cn, is responsible for building and running the Evolved Packet Core (EPC) networks, as shown in Fig. 3. The openair-cn component provides a programmable environment to implement and manage the following network elements: Mobility Management Entity (MME), Home Subscriber Server (HSS), Serving Gateway (SPGW-C), and PDN Gateway (PGW-U). We use WiFi as well as LTE as the physical link between the UE and the edge. The edge is connected to the cloud through Ethernet. As illustrated in Fig. 3, all the EPC and gNB components are implemented as container images by using Docker and docker-compose [29]. The UE and the RF RAN are implemented in hardware, as a conventional cell phone and a USRP 210, respectively.

Fig. 3. Logical illustration of the fully containerized-based NG-RAN testbed.

6.1.2. NG-RAN testbed for different functional options

We endowed our testbed with several functional split options so as to realize the CU and DU in the gNB. All containers in Fig. 3 are hosted by the desktop PC with an Intel Xeon E5-1650, 12-core at 3.5 GHz and 32 GB RAM. For the UE, we use a Samsung Galaxy S9 running Android 10. For the network configuration, we run our NG-RAN prototype with three functional splits: Option IF1 (PDCP/RLC, Option 2 in the 3GPP TR 38.801 standard), Option IF4.5 (Lower PHY/Higher PHY, a.k.a. Option 7.x in the 3GPP TR 38.801 standard), and Option LTE eNB. We summarize the testbed configuration parameters in Table 2.

Fig. 4(a) shows the CPU utilization percentage at the DU and CU containers. The CPU utilization percentage is measured by the docker stats command in Ubuntu, which provides a live data stream for running containers. Downlink UDP traffic is repeatedly sent from the SPGW-U container to the UE with various PRB settings in the two functional split Options, IF1 and IF4.5. It can be observed that the CPU consumption for the DU and CU continues to increase linearly as the number of PRBs is increased in the two functional split options. However, Option IF1 consumes a higher CPU percentage at the DU than at the CU. For example, the CPU utilization percentage is 43.67% in the DU while it is 14.42% in the CU. That is because the higher PHY operations such as RLC/MAC, L1/high, tx precode, rx combine, and L1/low reside in the DU for split Option IF1 [30]. In Option IF4.5, the pattern is reversed: we can see from Fig. 4(a) that the CPU usage at the CU is higher than at the DU. Fig. 4(b) shows the memory usage of the DU and CU containers when the NG-RAN testbed operates in Options IF1 and IF4.5 at different values of PRBs. Similar to the CPU consumption pattern, the memory usage at the DU is higher than at the CU in Option IF1. For example, the memory usage is 388 MB in the DU while it is 145.3 MB in the CU for Option IF1 and 25 PRBs.

Fig. 4. (a) CPU utilization vs. number of PRBs for DU and CU in Options IF1 and IF4.5; (b) Memory usage vs. number of PRBs for DU and CU in Options IF1 and IF4.5.

6.2. Application profiling

To test QLRan, we consider two applications: video streaming and facial detection in smart surveillance cameras. These two tasks are both video-based tasks that require varying degrees of quality.
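The profiling in this subsection reduces to fitting linear models to per-level measurements. As a minimal sketch (not the authors' code), a least-squares fit recovers the coefficients of the memory model 𝐵(𝑞); the sample values below are hypothetical and chosen to reproduce the fitted memory model reported later in Eq. (14):

```python
import numpy as np

# Hypothetical per-QLR-level profiling samples (QLR 1 = best quality).
# Memory usage [MB] logged while serving each QLR level; values are
# illustrative only, constructed to match B(q) = -10.4*q + 95.9.
qlr_levels = np.array([1, 2, 3, 4, 5])
mem_mb = np.array([85.5, 75.1, 64.7, 54.3, 43.9])

# Least-squares fit of the linear model B(q) = y_b*q + z_b,
# i.e., the offline-profiling step that produces (y_b, z_b).
y_b, z_b = np.polyfit(qlr_levels, mem_mb, deg=1)

def B(q):
    """Fitted memory-requirement model fed to the QLRan LP."""
    return y_b * q + z_b

print(f"B(q) = {y_b:.1f} q + {z_b:.1f}")  # B(q) = -10.4 q + 95.9
```

The same fit, applied to CPU and data-size samples, yields 𝐶(𝑞𝑘) and 𝐷(𝑞𝑘).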


Fig. 5. (a) Memory usage for various QLR levels in video streaming; (b) CPU usage for various QLR levels in video streaming.

Fig. 6. (a) Relation between a video’s bitrate and CPU consumption in video streaming; and (b) Latency in facial recognition.
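On the optimization side, these discrete quality profiles are exactly what makes the linearization of Proposition 2 work: since 𝑎𝑠𝑘 is binary and 𝑞𝑘 takes only the five QLR levels, the equivalence between constraints (11c)-(11d) and 𝑥𝑠𝑘 = 𝑎𝑠𝑘𝑞𝑘 can be sanity-checked exhaustively at the integral points. A small illustrative sketch (not part of the paper's solver):

```python
# Exhaustive check of Proposition 2 at integral points: for
# a in {0,1} (offloading decision) and q in {1,...,5} (QLR level),
# the big-M pair of constraints (11c)-(11d),
#     0 <= x <= 5*a   and   q - 5*(1 - a) <= x <= q,
# leaves exactly one feasible integer x, namely x = a*q.
def feasible_x(a, q):
    return [x for x in range(0, 6)
            if 0 <= x <= 5 * a and q - 5 * (1 - a) <= x <= q]

ok = all(feasible_x(a, q) == [a * q]
         for a in (0, 1) for q in range(1, 6))
print("x = a*q at every integral point:", ok)  # True
```

This mirrors the two cases of the proof: 𝑎𝑠𝑘 = 0 forces 𝑥𝑠𝑘 = 0, and 𝑎𝑠𝑘 = 1 forces 𝑥𝑠𝑘 = 𝑞𝑘.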

Table 2
Testbed configuration parameters for gNB.

Mode       FDD        | Options    IF1, IF4.5, eNB
Frequency  2.68 GHz   | PRB        25, 50, 100
TX Power   150 dBm    | Env.       Multi-container
MCS        28         | SINR       15-20

Table 3
Configuration parameters for simulation.

𝑦𝑑, 𝑧𝑑        4.3, 2.75     | Capacity [GB]         1.5
𝑦𝑡, 𝑧𝑡        −5.24, 3.31   | 𝛿𝜏/𝛿𝑞                [50, 100, 150]
𝑦𝑏, 𝑧𝑏        −10.41, 95.9  | Data rate [Mbps]      2
𝑈, 𝐾, 𝑆       10, 10, 20    | Delay tolerance [ms]  300
𝐵𝑠max [GB]    3             | QLR                   [1, 2, 3, 4, 5]

Video streaming application: Video streaming is run on two Dell Workstations, each with two Xeon E6-1650 processors. Each workstation is equipped with 32 GB of RAM running Ubuntu 18.04. In our experiments, a prerendered movie of one minute is streamed between these two computers using ffmpeg, a video transcoding and streaming application. On the other end, ffplay is used to receive and render the video stream. Four different video resolutions are used: 360 × 240, 480 × 360, 960 × 720, and 1920 × 1080. Additionally, for the highest resolution of 1920 × 1080, 30 fps as well as 60 fps streams are used, along with a stereographic stream at 60 fps for potential 3D reconstruction applications.

Facial recognition application: In addition to network streaming, a basic facial detection and recognition application is tested against the very same resolutions as in the network stream. The facial recognition algorithm is based on the popular and simple dLib library available for Python [31].

For both applications, we have chosen QLR 1 to represent the best networking conditions while a QLR of 5 represents the worst network conditions. Using the top utility, we were able to log at 1-second intervals the CPU consumption as well as the memory consumption of the process on the server streaming the video. Note that in both Fig. 5(a) and (b) we witness a linear trend in both memory and CPU consumption, which can be expressed by the following equations,

𝐵(𝑞𝑘) = −10.4𝑞𝑘 + 95.9,  𝐶(𝑞𝑘) = −5.2𝑞𝑘 + 33.3, ∀𝑘 ∈ 𝒦.   (14)

In Fig. 6(a), since we downsampled the video resolutions ourselves, we are able to extract the exact average bitrate for various stream profiles to arrive at an equation,

𝐷(𝑞𝑘) = 4.30𝑥 + 2.75, ∀𝑘 ∈ 𝒦,   (15)

where 𝑥 represents the achievable bit rate in Mbps. Similarly, as shown in Fig. 6(b), as the video resolution increases in the facial recognition application, so does the processing time. Hence, the QLR processing time can be modeled as,

𝑇proc = −0.08𝑞𝑘 + 0.51, ∀𝑘 ∈ 𝒦.   (16)

6.3. Numerical results

We consider an NG-RAN system consisting of a 100 m × 100 m cell with a RAP in the center. The mobile devices, 𝑁 = 25, are randomly located inside the cell. The channel gains are generated using a distance-dependent path-loss model given as 𝐿 [dB] = 140.7 + 36.7 log10 𝑑[km], where 𝑑 is the distance between the mobile device and the BS, and the log-normal shadowing variance is set to 8 dB. The other network parameter values are listed in Table 3.

In general, computational tasks can be classified into two different categories: (i) approximatable, i.e., tasks that can be approximated to achieve significant savings in execution time, with however a potential loss of accuracy in the result; and (ii) non-approximatable, i.e., tasks whose execution without any approximation is necessary for the success of the application, i.e., if any approximation technique were applied to these tasks, the application would not generate meaningful results. We refer the interested readers to the work in [32], which introduces a lightweight online algorithm that selects between these tasks to enable real-time distributed applications on resource-limited devices. Accordingly, we consider the video streaming and facial recognition applications, which can be considered approximatable tasks, for profiling. The reason for choosing these applications is that they can highly benefit from the collaboration between mobile devices and edge platforms. In the experimental results, we study the impact of the service quality level, which can be seen as the resolution level of the video streaming and facial applications, on the system latency and the edge node computing capacity.

6.3.1. Impact of the control parameters 𝛿𝑡 and 𝛿𝑞

We discussed the definitions of the scalar weights 𝛿𝑡 and 𝛿𝑞 in Section 5. In general, these parameters are used to make a tradeoff between service latency and quality. Specifically, when 𝛿𝑡/𝛿𝑞 is increased, the QLRan algorithm will be more sensitive to the system latency; otherwise, it will be more sensitive to the quality of the result. Fig. 7(a) shows that the latency cost decreases with a larger QLR parameter for different values of 𝛿𝑡/𝛿𝑞, namely 50, 100, and 150. Specifically, the average system latency when the 𝛿𝑡/𝛿𝑞 ratio is 50 and the QLR level is 1 is around 300 ms, while the average system latency values are 275 ms and 250 ms when the QLR level is 1 and the 𝛿𝑡/𝛿𝑞 ratios are 100 and 150, respectively. That is because the QLR level of 1 refers to the best accuracy that can be obtained from processing the computational task in the edge cloud node, so the computational complexity at QLRan increases and with it the system latency. The system latency decreases as the quality requirement is relaxed. Moreover, QLRan shows good performance when the algorithm is biased towards the latency objective. For instance, when the QLR level increases to 4, the average system latency of QLRan drops to 220 ms, 200 ms, and 180 ms for 𝛿𝑡/𝛿𝑞 = 50, 100, and 150, respectively.

Fig. 7. System latency performance versus: (a) QLR levels; and (b) Computing capacities.

6.3.2. Impact of the computing capacity of the edge cloud node

To evaluate the offloading performance in terms of memory usage, 𝐵(𝑞𝑘), we run the QLRan algorithm for different values of computing capacity at 𝛿𝑡/𝛿𝑞 ratios of 50, 100, and 150. We observe that as long as the memory requirements are satisfied, the computing capacity (CPU/GPU) requirements can also be satisfied. Hence, the performance can be evaluated over several memory sizes. As mentioned in Table 3, we set the memory size of an edge node to 𝐵𝑠max = 1.5 GB by default, while the 𝛿𝑡/𝛿𝑞 ratio is tuned over 50, 100, and 150 to measure the system latency and QLR for several memory capacity values. Also, the memory size of each edge node is tuned from 0.5 to 2 GB. As illustrated in Fig. 7(b), the system latency decreases as the memory capacity available to the QLRan algorithm increases. Specifically, the service latency decreases by around 12% when tuning the 𝛿𝑡/𝛿𝑞 ratio from 50 to 100 at a computing capacity of 0.5 GB, while the overall pattern continues to decrease as the computing capacity value is increased.

6.3.3. Impact of an increasing number of tasks

For the computation task, we use the face detection and recognition application for airport security and surveillance [33], which can highly benefit from the collaboration between mobile devices and edge platforms. The 12 computational tasks are configured with data sizes in the range of 90 to 250 kB and CPU requirements between 890 and 1150 Megacycles. Fig. 8(a) shows the performance of the different schemes versus the number of tasks. In this figure, the task data input is a random variable that increases linearly with the QLR level. It can be seen that the case 𝛿𝑡/𝛿𝑞 = 150 has a lower latency cost compared to the others.

6.4. Comparing QLRan with other baseline approaches

We compare the QLRan algorithm with the following existing benchmarks:

• Cloud Edge Executing Only (CEO): Each UE 𝑢 ∈ 𝒰 has only one option: to offload its task to a cloud edge node within its communication coverage, without considering the tradeoff between latency and approximate computing;
• Local Executing Only (LEO): Each UE 𝑢 ∈ 𝒰 has only one option: to execute its task locally within its communication coverage, without considering the tradeoff between latency and approximate computing;
• Latency-aware Task Offloading (LO): Each UE 𝑢 ∈ 𝒰 can offload its task to the edge cloud within its communication coverage. Here, only the latency is considered in the objective function, while approximate computing is ignored.

As illustrated in Fig. 8(b), we evaluate the running performance of 12 computational tasks under the different offloading schemes. Our joint latency and quality-aware offloading scheme outperforms the other schemes. Specifically, the performance gap between QLRan and the other schemes increases as the number of tasks increases. That is because the QLRan algorithm is designed to trade off between the latency and the QLR level, while the other schemes only focus on the offloading and executing scenarios.

Fig. 8. (a) System latency performance versus number of computational tasks; and (b) System latency versus number of computation tasks under different execution schemes.

7. Conclusions

We presented latency-quality tradeoffs and task offloading in multi-node next-generation RANs. We designed our algorithm, QLRan, to reduce the system service latency while adjusting the overall quality level. Practical NG-RAN system constraints have been considered to formulate the proposed task offloading problem. The constraints depend on the network latency, quality loss, and edge node computing capacity, while the objective function is the weighted sum of all the UEs' offloading utilities. QLRan is cast as an NP-hard problem; therefore, we propose a Linear Programming (LP)-based approach that can be solved via convex optimization techniques. Simulation results are generated from running several real-time applications on the NG-RAN testbed, which is completely implemented using container-based virtualization and functional-split option technologies. We considered video-streaming and facial-recognition applications as building blocks of many cloud-based applications. We evaluated our solution, and thorough simulation results showed that the QLRan algorithm significantly improves the network latency over different configurations.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] A. Younis, B. Qiu, D. Pompili, QLRan: Latency-quality tradeoffs and task offloading in multi-node next generation RANs, in: Proc. IEEE/IFIP WONS, 2021, pp. 1–8.
[2] A. Younis, T. Tran, D. Pompili, Energy-efficient resource allocation in C-RANs with capacity-limited fronthaul, IEEE Trans. Mob. Comput. 20 (2) (2021) 473–487.
[3] I.A. Alimi, A.L. Teixeira, P.P. Monteiro, Toward an efficient C-RAN optical fronthaul for the future networks: A tutorial on technologies, requirements, challenges, and solutions, IEEE Commun. Surv. Tutor. 20 (1) (2017) 708–769.
[4] A. Younis, T.X. Tran, D. Pompili, Fronthaul-aware resource allocation for energy efficiency maximization in C-RANs, in: Proc. IEEE ICAC, 2018, pp. 91–100.
[5] 3GPP TS 38.300 V2.0.0, NR; NR and NG-RAN overall description; Stage 2, 2017, Release 15.
[6] A. Younis, T.X. Tran, D. Pompili, On-demand video-streaming quality of experience maximization in mobile edge computing, in: Proc. IEEE WoWMoM, 2019, pp. 1–9.
[7] Y. Li, Y. Chen, T. Lan, G. Venkataramani, MobiQoR: Pushing the envelope of mobile edge computing via quality-of-result optimization, in: Proc. IEEE ICDCS, 2017, pp. 1261–1270.
[8] A. Younis, T.X. Tran, D. Pompili, Energy-latency-aware task offloading and approximate computing at the mobile edge, in: Proc. IEEE MASS, 2019, pp. 299–307.
[9] EURECOM, OAI, 2020, Available: http://www.openairinterface.org/.
[10] H.T. Dinh, C. Lee, D. Niyato, P. Wang, A survey of mobile cloud computing: Architecture, applications, and approaches, Wirel. Commun. Mob. Comput. 13 (18) (2013) 1587–1611.
[11] S. Kekki, W. Featherstone, Y. Fang, P. Kuure, A. Li, A. Ranjan, D. Purkayastha, F. Jiangping, D. Frydman, G. Verin, et al., MEC in 5G Networks, ETSI White Paper 28, 2018, pp. 1–28.
[12] CISCO, Fog Computing and the Internet of Things: Extend the Cloud to Where the Things Are, White Paper, 2015, pp. 1–6.
[13] O-RAN Alliance, O-RAN Use Cases and Deployment Scenarios, White Paper, 2020.
[14] T.X. Tran, D. Pompili, Joint task offloading and resource allocation for multi-server mobile-edge computing networks, IEEE Trans. Veh. Technol. 68 (1) (2018) 856–868.
[15] M. Qin, N. Cheng, Z. Jing, T. Yang, W. Xu, Q. Yang, R.R. Rao, Service-oriented energy-latency tradeoff for IoT task partial offloading in MEC-enhanced multi-RAT networks, IEEE Internet Things J. 8 (3) (2021) 1896–1907.
[16] Z. Zhao, S. Bu, T. Zhao, Z. Yin, M. Peng, Z. Ding, T.Q. Quek, On the design of computation offloading in fog radio access networks, IEEE Trans. Veh. Technol. 68 (7) (2019) 7136–7149.
[17] D.C. Nguyen, P.N. Pathirana, M. Ding, A. Seneviratne, Privacy-preserved task offloading in mobile blockchain with deep reinforcement learning, IEEE Trans. Netw. Serv. Manag. 17 (4) (2020) 2536–2549.
[18] 3GPP TR 38.801, Study of new radio access technology: Radio access architecture and interfaces, 2017, Release 14.
[19] L. Wang, S. Zhou, Flexible functional split and power control for energy harvesting cloud radio access networks, IEEE Trans. Wirel. Commun. 19 (3) (2019) 1535–1548.
[20] V. Kshirsagar, M. Baviskar, M. Gaikwad, Face recognition using eigenfaces, in: Proc. IEEE ICCRD, 2011, pp. 302–306.
[21] I. de Fez, R. Belda, J.C. Guerri, New objective QoE models for evaluating ABR algorithms in DASH, Comput. Commun. 158 (2020) 126–140.


[22] X. Chen, Decentralized computation offloading game for mobile cloud computing, IEEE Trans. Parallel Distrib. Syst. 26 (4) (2014) 974–983.
[23] X. Lyu, H. Tian, C. Sengul, P. Zhang, Multiuser joint task offloading and resource optimization in proximate clouds, IEEE Trans. Veh. Technol. 66 (4) (2016) 3435–3447.
[24] Y.W. Bernier, Latency compensating methods in client/server in-game protocol design and optimization, in: Game Developers Conference, Vol. 98033, 2001.
[25] C.-P. Schnorr, M. Euchner, Lattice basis reduction: Improved practical algorithms and solving subset sum problems, Math. Program. 66 (1–3) (1994) 181–199.
[26] MOSEK ApS, The MOSEK optimization toolbox v9, 2019.
[27] E.D. Andersen, K.D. Andersen, Presolving in linear programming, Math. Program. 71 (2) (1995) 221–245.
[28] T.Q. Dinh, J. Tang, Q.D. La, T.Q. Quek, Offloading in mobile edge computing: Task allocation and computational frequency scaling, IEEE Trans. Commun. 65 (8) (2017) 3571–3584.
[29] Docker, 2021, https://docs.docker.com/.
[30] OAI tutorials, 2021, https://gitlab.eurecom.fr/oai/openairinterface5g/blob/develop/doc/FEATURE_SET.md#enb-phy-layer.
[31] D.E. King, Dlib-ml: A machine learning toolkit, J. Mach. Learn. Res. 10 (2009) 1755–1758.
[32] P. Pandey, D. Pompili, Exploiting the untapped potential of mobile distributed computing via approximation, Pervasive Mob. Comput. 38 (2017) 381–395.
[33] T. Soyata, R. Muraleedharan, C. Funai, M. Kwon, W. Heinzelman, Cloud-vision: Real-time face recognition using a mobile-cloudlet-cloud acceleration architecture, in: Proc. IEEE ISCC, 2012, pp. 59–66.

Ayman Younis received the B.Eng. and M.Sc. degrees in Electrical Engineering from the U. of Basrah, Iraq, in 2008 and 2011, respectively. He is pursuing the Ph.D. degree in ECE at Rutgers, NJ, USA, under the guidance of Dr. Pompili. His research focuses on wireless communications and mobile cloud computing, with emphasis on software-defined testbeds. He received the Best Paper Award at the IEEE/IFIP Wireless On-demand Network Systems and Services Conference (WONS) in 2021.

Brian Qiu obtained an M.S. in Electrical and Computer Engineering (ECE) at Rutgers University, NJ, in 2021, where he also received his B.S. in 2019. He is interested in distributed mobile computing and in general in networks with a focus on anonymity and privacy.

Dario Pompili is an associate professor with the Dept. of ECE at Rutgers University. Since joining Rutgers in 2007, he has been the director of the CPS Lab, which focuses on mobile edge computing, wireless communications and networking, acoustic communications, and sensor networks. He received his Ph.D. in ECE from the Georgia Institute of Technology in 2007. He had previously received his 'Laurea' (combined BS and MS) and Doctorate degrees in Telecommunications and System Engineering from the U. of Rome "La Sapienza," Italy, in 2001 and 2004, respectively. He has received a number of awards in his career including the NSF CAREER'11, ONR Young Investigator Program'12, and DARPA Young Faculty'12 awards. In 2015, he was nominated Rutgers-New Brunswick Chancellor's Scholar. He served on many international conference committees taking on various leading roles. He published about 200 refereed scholar publications, some of which received best paper awards: with more than 13K citations, Dr. Pompili has an h-index of 44 and an i10-index of 111 (Google Scholar, Oct'21). He is a Fellow of the IEEE Communications Society (2021) and a Distinguished Member of the ACM (2019). He is currently serving as Associate Editor for IEEE Transactions on Mobile Computing (TMC).
