Latency and Quality-Aware Task Offloading in Multi-Node Next Generation RANs
Computer Communications
1. Introduction

Motivation: Mobile platforms (e.g., smartphones, tablets, IoT mobile devices) are becoming the predominant medium of access to Internet services due to a tremendous increase in their computation and communication capabilities. However, enabling applications that require real-time, in-the-field data collection and mobile platform processing is still challenging due to (i) the insufficient computing capabilities and unavailable aggregated/global data on individual mobile devices and (ii) the prohibitive communication cost and response time involved in offloading data to remote computing resources such as cloud datacenters for centralized computation. In light of these limitations, the term edge computing was introduced to unite telco, IT, and cloud computing and to provide cloud services directly from the network edge. In general, the edge cloud servers or nodes are usually deployed directly at the mobile Base Stations (BSs) of a Radio Access Network (RAN), or at the local wireless Access Points (APs), using a generic-computing platform. Hence, the edge cloud node has the ability to execute offloaded applications in close proximity to end users. In this way, the network end-to-end (e2e) latency and the back-/mid-/fronthaul costs are reduced. Recently, the Cloud Radio Access Network (C-RAN) [2] has emerged as a clean-slate redesign of the mobile network architecture in which parts of the physical-layer communication functionalities are decoupled from distributed, possibly heterogeneous, Radio Access Points (RAPs), i.e., BSs or WiFi hotspots, and are then consolidated into a baseband unit pool for centralized processing. However, the centralized C-RAN design follows a ''one size fits all'' architectural approach, which makes it difficult to address the wide range of Quality of Service (QoS) requirements and to support different types of traffic [3]. Also, a fully centralized architecture imposes high capacity requirements on fronthaul links [4]. Therefore, the Next Generation RAN (NG-RAN) [5] has been introduced as a resource-efficient solution to address the above issues and reduce deployment costs. It is worthy of note that, due to the flexibility of the NG-RAN architecture,
✩ A preliminary/shorter version of this work appeared in the Proc. of the IEEE/IFIP Wireless On-demand Network Systems and Services Conference (WONS),
Mar’21 [1].
✩✩ This work was supported by the US NSF Grant No. ECCS-2030101.
∗ Corresponding author.
E-mail addresses: [email protected] (A. Younis), [email protected] (B. Qiu), [email protected] (D. Pompili).
https://doi.org/10.1016/j.comcom.2021.11.026
Received 30 July 2021; Received in revised form 27 October 2021; Accepted 29 November 2021
Available online 24 December 2021
0140-3664/© 2021 Elsevier B.V. All rights reserved.
A. Younis, B. Qiu and D. Pompili Computer Communications 184 (2022) 107–117
mobile network operators will have a high degree of freedom to move from a ''full centralization'' in C-RAN to a ''partial centralization'' in NG-RAN with a specific functional splitting option, and on to a ''distributed approach'' in the edge cloud [6], enabling rich services and applications in close proximity to the end users.

Task offloading can enhance the performance of mobile devices because servers in the edge cloud have higher computation capabilities than mobile devices. Therefore, enabling task offloading in NG-RAN is proposed to address the limitations (e.g., storage and computing resources) of existing RANs. Meanwhile, in some cases, processing the entire input data in edge cloud servers would require more than the available computing resources to meet the desired latency/throughput guarantees. In the context of NG-RAN applications (e.g., IoT, AR/VR), transferring, managing, and analyzing large amounts of data in an edge cloud would be prohibitively expensive. Hence, trading off service latency against a tolerable quality loss can improve key network performance metrics such as the user's QoS [7,8]. In this paper, we define the Quality Loss of Results (QLR) as the level of relaxation/approximation in data processing at which the user's QoS is still acceptable. Accordingly, our key idea is motivated by the observation that in several NG-RAN applications such as media processing, image processing, and data mining, a high-accuracy result is not always necessary or desirable; instead, obtaining a suboptimal result at a low latency cost is more acceptable to vendors or end users. Consequently, relaxing QLR in such applications alleviates the required computation workload and enables a significant reduction of latency and computing cost in NG-RAN.

Our Vision: Our objective is to design a holistic decision-maker for an optimal joint task offloading scheme with quality and latency awareness in a multi-edge NG-RAN to minimize the UEs' overall offloading cost. Specifically, we consider a multi-edge node network where each RAP is equipped with an edge node to provide computation offloading services to UEs. In this way, several key benefits accrue to the NG-RAN system over the multi-node servers: (i) preventing the resource-limited edge nodes/servers from becoming the bottleneck; usually, cloud servers become overloaded when serving a large number of UEs with high processing priority, and by directing many UEs to nearby edge nodes, the overload can be alleviated; (ii) reducing the energy consumption and network latency, since each UE can offload its task to the RAP with the more favorable uplink channel condition; (iii) achieving better network collaboration, since the NG-RAN nodes in a multi-RAP set can coordinate with each other to manage and balance the computation resources across the edge servers. In this work, a latency- and quality-tradeoff task offloading problem, QLRan, is formulated to trade off the service latency against the acceptable level of QLR under specific application requirements (e.g., QoS, computing, and transmission demands). Additionally, the process of task allocation across edge nodes is formulated as an objective optimization problem whose objectives include both minimizing the average service latency and reducing the overall quality loss.

Our Contributions: The main objective of this paper is to design the QLRan algorithm, optimizing the trade-off between the application completion time and the QLR cost. The main contributions of this paper are summarized as follows.

• Subject to transmission and processing delays, quality loss, and computing capacity constraints, we formulate and analyze mathematically the QLRan optimization problem in NG-RAN as a Mixed Integer Nonlinear Program (MINLP) that jointly optimizes the computational task allocation and the QLR levels. The problem formulation and analysis trade off optimizing the service latency against the overall quality loss.
• The QLRan optimization problem is proved to be a non-deterministic polynomial-time hard (NP-hard) problem. To solve the problem efficiently, we first relax the binary computation offloading decision variable and the QLR level to real numbers. Then, we utilize a Linear Programming (LP)-based method to solve the relaxed QLRan problem using convex optimization techniques.
• We provide a set of tools to deploy the NG-RAN mobile network. To explore virtualization in the 5G system, we assign several OpenAirInterface (OAI) [9] containers composing a RAN and the core of the 5G system. Specifically, we implement a programmable testbed to demonstrate a connection between the UE, the RAN, and the Evolved Packet Core (EPC) implemented in the NG-RAN virtualization environment. The real-time experiments are carried out under various configurations in order to profile functional splitting, the data input, memory usage, and average processing time with respect to QLR levels.
• We provide formal proofs of the convergence and optimality of our algorithm and evaluate its performance under different network conditions. In terms of computing capacity and number of tasks, the numerical results show that latency can be reduced by relaxing the QLR level under practical physical constraints.

Paper Organization: The remainder of this article is organized as follows. The related work is introduced in Section 2. Section 3 includes a system overview in terms of functional split options and the task offloading process. We present the system model in Section 4. The QLRan problem is formulated in Section 5, followed by a linear programming-based solution to the QLRan optimization problem. The performance evaluation is discussed in Section 6; finally, we conclude the paper in Section 7.

2. Related work

In this section, we introduce the key concepts and papers from both industry and academia over the past several years.

2.1. Related concepts and technologies

Several cloud-based task offloading frameworks have been proposed in recent years. For example, Mobile Cloud Computing (MCC) has been proposed as a cloud-based network that can provide mobile devices with significant capabilities such as storage, computation, and task offloading to a centralized cloud [10]. However, MCC faces several noticeable challenges in addressing the mobile next generation in terms of end-to-end network latency, coverage, and security. To tackle these challenges, Multi-access Edge Computing (MEC) has been introduced by the European Telecommunications Standards Institute (ETSI) as an integration of edge cloud computing systems and wireless mobile networks [11]. One of the key features of MEC is to enable rich services and applications in close proximity to end users. With the MEC paradigm, mobile devices have the option to offload their computation-intensive tasks to a MEC server to meet the demanding Key Performance Indicators (KPIs) of 5G and beyond, especially in terms of low latency and energy efficiency. Similar to MEC systems, fog computing networks have been proposed by CISCO to bring cloud services to the edge of an enterprise network [12]. In fog networks, the computation processing is mainly executed in local area networks and in IoT gateways or fog nodes. Recently, the concept of NG-RAN has been defined by 3GPP as a promising approach to merge edge cloud features and RAN functionalities. In industry, many RAN organizations have made significant progress in implementing open-source software that supports NG-RAN technology. For instance, EURECOM has implemented the OpenAirInterface (OAI) platform [9], which provides an open, full software implementation of 5G and beyond systems compliant with 3GPP standards under real-time algorithms and protocols. Plus, O-RAN [13], founded by AT&T, aims to drive the mobile industry towards an ecosystem of innovative, multi-vendor, interoperable, and autonomous NG-RAN with reduced cost, improved performance, and greater agility. In general, these open RAN-software projects have a high degree of flexibility, such as being able to run CU and DU entities over a fully virtual environment such as VMs or Linux containers, as
Fig. 1. Logical diagram for uplink/downlink of gNB with eight functional split options.
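For reference, the eight options of Fig. 1 correspond to the CU/DU protocol-boundary splits defined in 3GPP TR 38.801 [18]. A compact lookup table (the Python encoding is ours, not the paper's code; the IF1/IF4.5 aliases follow the testbed naming used later in Section 6) might be:

```python
# Sketch (our encoding): the eight CU/DU functional split options of
# 3GPP TR 38.801, as depicted in Fig. 1.  Higher option numbers
# centralize more processing in the CU and raise the fronthaul
# capacity requirements.
SPLIT_OPTIONS = {
    1: "RRC / PDCP",
    2: "PDCP / high RLC",    # F1-style split, called Option IF1 in Section 6
    3: "high RLC / low RLC",
    4: "RLC / MAC",
    5: "intra-MAC",
    6: "MAC / PHY",
    7: "intra-PHY",          # 7.x variants, e.g., Option IF4.5 in Section 6
    8: "PHY / RF",
}

def split_boundary(option: int) -> str:
    """Return the protocol boundary at which an option splits CU from DU."""
    return SPLIT_OPTIONS[option]
```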
1. Edge cloud nodes: Initially, a UE searches its communication area for the best edge cloud node to connect to. Hence, the UE sends a pilot signal and collects responses from the edge cloud nodes. Any edge cloud node that responds is considered a potential candidate. For instance, in Fig. 2, the edge cloud candidate of the UE is the edge node within the coverage area of the LTE eNB DU.
2. Task classification: After the edge cloud node assignment, the UE starts uploading the task information to the edge node. Key information includes: (i) the unique ID of the uploading task; (ii) the application's layers and requirements; (iii) the task profile, which includes the task constraints (e.g., tolerable latency, QLR level, workload).
3. Task executing: After task classification, the RAP runs a resource allocation algorithm to determine: (i) the service time required for task accomplishment; (ii) the computing capacity available for task execution; and (iii) whether these estimates meet the task's tolerable latency requirement.

end-users to take advantage of the tradeoff between QLR and service latency. For instance, many object recognition algorithms demand specific methods for extracting a number of layers with given wavelengths and orientations from image datasets for advanced analysis [20]. Hence, the QLR governing the processing time of object recognition can be relaxed if the number of extracted layers is properly adapted. Another example is multi-bitrate video streaming, in which Over-The-Top (OTT) video content providers (e.g., YouTube, Amazon Prime, Netflix) offer end-users different video quality levels to fit the device's display and network connection [6,21]. Adjusting the video quality levels can save extra computational energy and time for OTT video providers while still giving users a good, uninterrupted viewing experience. In this paper, we denote by 𝑞𝑘 the QLR level assigned to task 𝑘. Hence, we allow each UE 𝑢 to select different 𝑞𝑘 values to exploit the trade-off between processing cost and latency. We define five QLR levels, in which level 1 refers to the strictest demand for quality, while level 5 represents the highest tolerance for quality loss. In practice, QLR levels are determined at an application-specific level.
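As an illustration, a video-streaming task could map the five QLR levels onto the stream profiles profiled later in Section 6.2. The mapping below is hypothetical, not a table taken from the paper (the paper profiles four resolutions, so two levels share the lowest profile here):

```python
# Hypothetical illustration: mapping QLR levels (1 = strictest quality,
# 5 = highest tolerance for quality loss) to video stream profiles.
# The four resolutions are the ones profiled in Section 6.2; the
# level-to-resolution assignment itself is our example.
QLR_PROFILES = {
    1: (1920, 1080),
    2: (960, 720),
    3: (480, 360),
    4: (360, 240),
    5: (360, 240),   # most tolerant level reuses the lowest profile
}

def select_profile(qlr_level: int) -> tuple:
    """Pick the (width, height) stream profile for a task's QLR level."""
    if qlr_level not in QLR_PROFILES:
        raise ValueError("QLR level must be an integer in 1..5")
    return QLR_PROFILES[qlr_level]
```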
4. System model
110
A. Younis, B. Qiu and D. Pompili Computer Communications 184 (2022) 107–117
Fig. 2. System overview of QLRan, in which the gray circle represents the communication range of the RAP.
4.4. System constraints

We now introduce the following four constraints to capture the features of a task offloading multi-node NG-RAN system.

computation capacity generated from processing task 𝑘 at QLR level 𝑞𝑘. Hence, the capacity constraint is modeled as,

∑_{𝑘∈𝒦} 𝐵(𝑞𝑘) 𝑎𝑠𝑘 ≤ 𝐵𝑠^max, ∀𝑠 ∈ 𝒮. (7)
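Constraint (7) is mechanical to check. A minimal sketch, reusing the linear memory-usage fit 𝐵(𝑞𝑘) = −10.4𝑞𝑘 + 95.9 reported later in Eq. (14) (units assumed to be MB) together with a made-up assignment of tasks to one edge node 𝑠:

```python
# Sketch of capacity constraint (7): the total memory demand of the tasks
# assigned to edge node s must not exceed B_s^max.  B(q) is the paper's
# fitted memory profile (14); the task set below is a made-up example.

def B(q: float) -> float:
    """Memory demand of a task processed at QLR level q (fit (14), MB assumed)."""
    return -10.4 * q + 95.9

def capacity_ok(qlr_of_task: dict, b_max: float) -> bool:
    """qlr_of_task maps task id -> QLR level for the tasks with a_sk = 1 on node s."""
    return sum(B(q) for q in qlr_of_task.values()) <= b_max

# Three tasks on one node, at QLR levels 1, 3, and 5:
demand = sum(B(q) for q in (1, 3, 5))   # 85.5 + 64.7 + 43.9 = 194.1
```

Raising a task's QLR level lowers 𝐵(𝑞𝑘), so a node close to its memory cap can regain feasibility by relaxing quality instead of rejecting the task, which is exactly the tradeoff QLRan exploits.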
111
A. Younis, B. Qiu and D. Pompili Computer Communications 184 (2022) 107–117
lows: constraint (9b) ensures that the computation task can be accomplished within a time that does not exceed the demanded maximum threshold, 𝜏𝑘^max; constraint (9c) implies that the demand for computation capacity must not exceed the edge node's capacity; finally, constraint (9d) indicates that each task must be assigned as a whole to one edge node.

Proposition 2. Constraints (11c) and (11d) can be relaxed to the constraint 𝑥𝑠𝑘 = 𝑎𝑠𝑘𝑞𝑘.

Proof. Case 1: (𝑎𝑠𝑘 = 0 and 𝑞𝑘 ∈ [1, 5]). From constraints (11c) and (11d), we can conclude the following,
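The relax-and-round idea behind the LP-based method (relax the binary offloading decision 𝑎𝑠𝑘 to [0, 1], solve the linear problem, then round back) can be illustrated on a toy instance. The latency numbers and the per-task constraint ∑𝑠 𝑎𝑠𝑘 = 1 below are invented for illustration; the paper's actual relaxed program additionally couples tasks through the capacity constraints and is solved with an LP solver:

```python
# Toy relax-and-round sketch (made-up numbers, not the paper's program):
# with only the simplex constraint sum_s a_sk = 1, a_sk >= 0 per task,
# the relaxed LP optimum puts all weight on the cheapest node, so
# rounding back to a binary assignment is exact.

def relaxed_assignment(latency):
    """latency[k][s]: offloading latency of task k on node s (ms).
    Returns the fractional a_sk minimizing sum_k sum_s latency[k][s]*a_sk."""
    frac = []
    for row in latency:
        best = min(range(len(row)), key=row.__getitem__)
        frac.append([1.0 if s == best else 0.0 for s in range(len(row))])
    return frac

def round_assignment(frac):
    """Round each task to the node holding the largest fractional share."""
    return [max(range(len(row)), key=row.__getitem__) for row in frac]

# Two tasks, three edge nodes:
lat = [[30.0, 22.0, 41.0],
       [18.0, 25.0, 19.0]]
nodes = round_assignment(relaxed_assignment(lat))   # task 0 -> node 1, task 1 -> node 0
```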
Fig. 4. (a) CPU utilization vs. number of PRBs for DU and CU in Options IF1 and IF4.5; (b) Memory usage vs. number of PRBs for DU and CU in Options IF1 and IF4.5.
RAM. For the UE, we use a Samsung Galaxy S9 running Android 10. For the network configuration, we run our NG-RAN prototype for three functional splits: Option IF1 (PDCP/RLC, Option 2 in the 3GPP TR 38.801 standard), Option IF4.5 (Lower PHY/Higher PHY, a.k.a. Option 7.x in the 3GPP TR 38.801 standard), and Option LTE eNB. We summarize the testbed configuration parameters in Table 2.

Fig. 4(a) shows the CPU utilization percentage at the DU and CU containers. The CPU utilization percentage is measured with the docker stats command in Ubuntu, which provides a live data stream for running containers. Downlink UDP traffic is repeatedly sent from the SPGW-U container to the UE with various PRB settings in the two functional split Options, IF1 and IF4.5. It can be observed that the CPU consumption for the DU and CU continues to increase linearly as the number of PRBs is increased in both functional split options. However, Option IF1 consumes a higher CPU percentage at the DU than at the CU. For example, the CPU utilization percentage is 43.67% in the DU while it is 14.42% in the CU. That is because operations such as RLC/MAC, L1/high, tx precode, rx combine, and L1/low reside in the DU for split Option IF1 [30]. In Option IF4.5, the pattern is reversed: we can see from Fig. 4(a) that the CPU usage at the CU is higher than at the DU. Fig. 4(b) shows the memory usage of the DU and CU containers when the NG-RAN testbed operates in Options IF1 and IF4.5 at different numbers of PRBs. Similar to the CPU consumption pattern, the memory usage at the DU is higher than at the CU in Option IF1. For example, the memory usage is 388 MB in the DU while it is 145.3 MB in the CU for Option IF1 and 25 PRBs.

6.2. Application profiling

To test QLRan, we consider two applications: video streaming and facial detection in smart surveillance cameras. These two tasks are both video-based tasks that require varying degrees of quality.
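A minimal way to collect such per-container samples, as done above with the docker stats command, is to run it in --no-stream mode and parse one line per container. The container name and the sample line below are illustrative, not measurements from the testbed:

```python
# Sketch: sampling container CPU/memory the way Section 6.1 does.  The
# command is standard Docker CLI; the sample line is illustrative output.
#
#   docker stats --no-stream --format "{{.Name}},{{.CPUPerc}},{{.MemUsage}}"

def parse_stats_line(line: str):
    """Parse 'name,43.67%,388MiB / 1.5GiB' into (name, cpu %, memory MiB)."""
    name, cpu, mem = line.split(",")
    used = mem.split("/")[0].strip()      # e.g., '388MiB'
    if used.endswith("GiB"):
        mem_mib = float(used[:-3]) * 1024
    else:                                 # assumes 'MiB'
        mem_mib = float(used[:-3])
    return name, float(cpu.rstrip("%")), mem_mib

sample = "oai-du,43.67%,388MiB / 1.5GiB"   # hypothetical DU container name
print(parse_stats_line(sample))            # ('oai-du', 43.67, 388.0)
```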
Fig. 5. (a) Memory usage for various QLR levels in video streaming; (b) CPU usage for various QLR levels in video streaming.
Fig. 6. (a) Relation between a video’s bitrate and CPU consumption in video streaming; and (b) Latency in facial recognition.
Video streaming application: Video streaming is run on two Dell workstations, each with two Xeon E6-1650 processors. Each workstation is equipped with 32 GB of RAM running Ubuntu 18.04. In our experiments, a prerendered one-minute movie is streamed between these two computers using ffmpeg, a video transcoding and streaming application. On the other end, ffplay is used to receive and render the video stream. Four different video resolutions are used: 360 × 240, 480 × 360, 960 × 720, and 1920 × 1080. Additionally, for the highest

𝐵(𝑞𝑘) = −10.4𝑞𝑘 + 95.9, 𝐶(𝑞𝑘) = −5.2𝑞𝑘 + 33.3, ∀𝑘 ∈ 𝒦. (14)

In Fig. 6(a), since we downsampled the video resolutions ourselves, we are able to extract the exact average bitrate for the various stream profiles and arrive at the equation,

𝐷(𝑞𝑘) = 4.30𝑥 + 2.75, ∀𝑘 ∈ 𝒦, (15)

where 𝑥 represents the achievable bit rate in Mbps. Similarly, as shown in Fig. 6(b), as video resolutions increase in the facial recognition
Fig. 7. System latency performance versus: (a) QLR levels; and (b) Computing capacities.
application, so does the processing time. Hence, the QLR processing time can be modeled as,

𝑇^𝑝𝑟𝑜𝑐 = −0.08𝑞𝑘 + 0.51, ∀𝑘 ∈ 𝒦. (16)

6.3. Numerical results

We consider an NG-RAN system consisting of a 100 m × 100 m cell with a RAP in the center. The mobile devices, 𝑁 = 25, are randomly located inside the cell. The channel gains are generated using a distance-dependent path-loss model given as 𝐿 [dB] = 140.7 + 36.7 log10 𝑑 [km], where 𝑑 is the distance between the mobile device and the BS, and the log-normal shadowing variance is set to 8 dB. The other network parameter values are listed in Table 3.

In general, the computational tasks can be classified into two different categories: (i) approximatable tasks, which can be approximated to achieve significant savings in execution time, with however a potential loss of accuracy in the result; and (ii) non-approximatable tasks, whose execution without any approximation is necessary for the success of the application, i.e., if any approximation technique were applied to these tasks, the application would not generate meaningful results. We refer the interested reader to the work in [32], which introduces a lightweight online algorithm that selects between these tasks to enable real-time distributed applications on resource-limited devices. Accordingly, we consider video streaming and facial recognition applications, which can be considered approximatable tasks, for profiling. The reason for choosing these applications is that they can highly benefit from the collaboration between mobile devices and edge platforms. In the experimental results, we study the impact of the service quality level, which can be considered the resolution level of the video streaming and facial applications, on the system latency and the edge node computing capacity.

6.3.1. Impact of control parameters 𝛿^𝑡 and 𝛿^𝑞

We discussed the definitions of the scalar weights 𝛿^𝑡 and 𝛿^𝑞 in Section 5. In general, these parameters are used to make a tradeoff between the service latency and quality. Specifically, when 𝛿^𝑡∕𝛿^𝑞 is increased, the QLRan algorithm becomes more sensitive to system latency; otherwise, it becomes more sensitive to the quality of the result. Fig. 7(a) shows that the latency cost decreases with a larger QLR parameter for different values of 𝛿^𝑡∕𝛿^𝑞, namely 50, 100, and 150. Specifically, the average system latency at a 𝛿^𝑡∕𝛿^𝑞 ratio of 50 and QLR level 1 is around 300 ms, while the average system latency values are 275 ms and 250 ms at QLR level 1 for 𝛿^𝑡∕𝛿^𝑞 ratios of 100 and 150, respectively. That is because QLR level 1 refers to the best accuracy that can be obtained from processing the computational task in the edge cloud node, so the computational complexity of QLRan increases and, with it, the system latency. The system latency decreases as the tolerance for quality loss grows. Plus, QLRan shows good performance when the algorithm is weighted towards latency performance. For instance, when the QLR level increases to 4, the average system latency of QLRan drops to 220 ms, 200 ms, and 180 ms for 𝛿^𝑡∕𝛿^𝑞 = 50, 100, and 150, respectively.

6.3.2. Impact of computing capacity of edge cloud node

To evaluate the offloading performance in terms of memory usage, 𝐵(𝑞𝑘), we run the QLRan algorithm for different values of the computing capacity at 𝛿^𝑡∕𝛿^𝑞 ratios of 50, 100, and 150. We observe that as long as the memory requirements are satisfied, the computing capacity (CPU/GPU) requirements can be satisfied. Hence, the performance can be evaluated with several memory sizes. As mentioned in Table 3, we set the memory size of an edge node to 𝐵𝑠^max = 1.5 GB by default, while 𝛿^𝑡∕𝛿^𝑞 ratios of 50, 100, and 150 are tuned to measure the system latency and QLR with several memory capacity values. Also, the memory size of each edge node is tuned from 0.5 to 2 GB. As illustrated in Fig. 7(b), the system latency decreases when the memory capacity available to the QLRan algorithm increases. Specifically, the service latency decreases by around 12% when tuning the 𝛿^𝑡∕𝛿^𝑞 ratio from 50 to 100 at a computing capacity of 0.5 GB, and the overall pattern continues to decrease as the computing capacity is increased.

6.3.3. Impact of increasing number of tasks

For the computation task, we use the face detection and recognition application for airport security and surveillance [33], which can highly benefit from the collaboration between mobile devices and edge platforms. The parameters of the 12 computational tasks are selected in the range of 90 to 250 kB for the data size and between 890 and 1150 Megacycles for the CPU cycles. Fig. 8(a) shows the performance of the different schemes versus the number of tasks. In this figure, the task data input is a random variable that increases linearly with the QLR level. It can be seen that the case 𝛿^𝑡∕𝛿^𝑞 = 150 has a lower latency cost compared to the others.

6.4. Comparing QLRan with other baseline approaches

We compare the QLRan algorithm with the following existing benchmarks:

• Cloud Edge Executing Only (CEO): Each UE 𝑢 ∈ 𝒰 has only one option: to offload its task to the cloud edge node within its communication coverage, without considering the tradeoff between latency and approximate computing;
Fig. 8. (a) System latency performance versus number of computational tasks; and (b) System latency versus number of computation tasks under different execution schemes.
• Local Executing Only (LEO): Each UE 𝑢 ∈ 𝒰 has only one option: to execute its task locally within its communication coverage, without considering the tradeoff between latency and approximate computing;
• Latency-aware Task Offloading (LO): Each UE 𝑢 ∈ 𝒰 can offload its task to an edge cloud within its communication coverage. Here, only the latency is considered in the objective function, while approximate computing is ignored.

As illustrated in Fig. 8(b), we evaluate the running performance of 12 computational tasks under the different offloading schemes. Our joint latency and quality-aware offloading scheme outperforms the other schemes. Specifically, the performance gap between QLRan and the other schemes increases as the number of tasks increases. That is because the QLRan algorithm is designed to trade off between the latency and the QLR level, while the other schemes only focus on the offloading and executing scenarios.

7. Conclusions

We presented latency-quality tradeoffs and task offloading in multi-node next-generation RANs. We designed our algorithm, QLRan, to reduce the system service latency while adjusting the overall quality level. Practical NG-RAN system constraints have been considered to formulate the proposed task offloading problem. The constraints depend on network latency, quality loss, and edge node computing capacity, while the objective function is the weighted sum of all the UEs' offloading utilities. The QLRan problem is cast as an NP-hard problem; therefore, we propose a Linear Programming (LP)-based approach that can be solved via convex optimization techniques. Simulation results are generated from running several real-time applications on the NG-RAN testbed, which is completely implemented using container-based virtualization and functional-split option technologies. We considered video-streaming and facial-recognition applications as building blocks of many cloud-based applications. We evaluated our solution, and thorough simulation results showed that the QLRan algorithm significantly improves the network latency over different configurations.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] A. Younis, B. Qiu, D. Pompili, QLRan: Latency-quality tradeoffs and task offloading in multi-node next generation RANs, in: Proc. IEEE/IFIP WONS, 2021, pp. 1–8.
[2] A. Younis, T. Tran, D. Pompili, Energy-efficient resource allocation in C-RANs with capacity-limited fronthaul, IEEE Trans. Mob. Comput. 20 (2) (2021) 473–487.
[3] I.A. Alimi, A.L. Teixeira, P.P. Monteiro, Toward an efficient C-RAN optical fronthaul for the future networks: A tutorial on technologies, requirements, challenges, and solutions, IEEE Commun. Surv. Tutor. 20 (1) (2017) 708–769.
[4] A. Younis, T.X. Tran, D. Pompili, Fronthaul-aware resource allocation for energy efficiency maximization in C-RANs, in: Proc. IEEE ICAC, 2018, pp. 91–100.
[5] 3GPP TS 38.300 V2.0.0, NR; NR and NG-RAN overall description; Stage 2, 2017, Release 15.
[6] A. Younis, T.X. Tran, D. Pompili, On-demand video-streaming quality of experience maximization in mobile edge computing, in: Proc. IEEE WoWMoM, 2019, pp. 1–9.
[7] Y. Li, Y. Chen, T. Lan, G. Venkataramani, MobiQoR: Pushing the envelope of mobile edge computing via quality-of-result optimization, in: Proc. IEEE ICDCS, 2017, pp. 1261–1270.
[8] A. Younis, T.X. Tran, D. Pompili, Energy-latency-aware task offloading and approximate computing at the mobile edge, in: Proc. IEEE MASS, 2019, pp. 299–307.
[9] EURECOM, OAI, 2020, Available: http://www.openairinterface.org/.
[10] H.T. Dinh, C. Lee, D. Niyato, P. Wang, A survey of mobile cloud computing: Architecture, applications, and approaches, Wirel. Commun. Mob. Comput. 13 (18) (2013) 1587–1611.
[11] S. Kekki, W. Featherstone, Y. Fang, P. Kuure, A. Li, A. Ranjan, D. Purkayastha, F. Jiangping, D. Frydman, G. Verin, et al., MEC in 5G Networks, ETSI White Paper 28, 2018, pp. 1–28.
[12] CISCO, Fog Computing and the Internet of Things: Extend the Cloud to Where the Things Are, White Paper, 2015, pp. 1–6.
[13] O-RAN Alliance, O-RAN Use Cases and Deployment Scenarios, White Paper, 2020.
[14] T.X. Tran, D. Pompili, Joint task offloading and resource allocation for multi-server mobile-edge computing networks, IEEE Trans. Veh. Technol. 68 (1) (2018) 856–868.
[15] M. Qin, N. Cheng, Z. Jing, T. Yang, W. Xu, Q. Yang, R.R. Rao, Service-oriented energy-latency tradeoff for IoT task partial offloading in MEC-enhanced multi-RAT networks, IEEE Internet Things J. 8 (3) (2021) 1896–1907.
[16] Z. Zhao, S. Bu, T. Zhao, Z. Yin, M. Peng, Z. Ding, T.Q. Quek, On the design of computation offloading in fog radio access networks, IEEE Trans. Veh. Technol. 68 (7) (2019) 7136–7149.
[17] D.C. Nguyen, P.N. Pathirana, M. Ding, A. Seneviratne, Privacy-preserved task offloading in mobile blockchain with deep reinforcement learning, IEEE Trans. Netw. Serv. Manag. 17 (4) (2020) 2536–2549.
[18] 3GPP TR 38.801, Study of new radio access technology: Radio access architecture and interfaces, 2017, Release 14.
[19] L. Wang, S. Zhou, Flexible functional split and power control for energy harvesting cloud radio access networks, IEEE Trans. Wirel. Commun. 19 (3) (2019) 1535–1548.
[20] V. Kshirsagar, M. Baviskar, M. Gaikwad, Face recognition using eigenfaces, in: Proc. IEEE ICCRD, 2011, pp. 302–306.
[21] I. de Fez, R. Belda, J.C. Guerri, New objective QoE models for evaluating ABR algorithms in DASH, Elsevier Comput. Commun. 158 (2020) 126–140.
[22] X. Chen, Decentralized computation offloading game for mobile cloud computing, IEEE Trans. Parallel Distrib. Syst. 26 (4) (2014) 974–983.
[23] X. Lyu, H. Tian, C. Sengul, P. Zhang, Multiuser joint task offloading and resource optimization in proximate clouds, IEEE Trans. Veh. Technol. 66 (4) (2016) 3435–3447.
[24] Y.W. Bernier, Latency compensating methods in client/server in-game protocol design and optimization, in: Game Developers Conference, Vol. 98033, 2001.
[25] C.-P. Schnorr, M. Euchner, Lattice basis reduction: Improved practical algorithms and solving subset sum problems, Math. Program. 66 (1–3) (1994) 181–199.
[26] MOSEK ApS, The MOSEK optimization toolbox v 9, 2019.
[27] E.D. Andersen, K.D. Andersen, Presolving in linear programming, Math. Program. 71 (2) (1995) 221–245.
[28] T.Q. Dinh, J. Tang, Q.D. La, T.Q. Quek, Offloading in mobile edge computing: Task allocation and computational frequency scaling, IEEE Trans. Commun. 65 (8) (2017) 3571–3584.
[29] Docker, 2021, https://docs.docker.com/.
[30] OAI tutorials, 2021, https://gitlab.eurecom.fr/oai/openairinterface5g/blob/develop/doc/FEATURE_SET.md#enb-phy-layer.
[31] D.E. King, Dlib-ml: A machine learning toolkit, J. Mach. Learn. Res. 10 (2009) 1755–1758.
[32] P. Pandey, D. Pompili, Exploiting the untapped potential of mobile distributed computing via approximation, Pervasive Mob. Comput. 38 (2017) 381–395.
[33] T. Soyata, R. Muraleedharan, C. Funai, M. Kwon, W. Heinzelman, Cloud-vision: Real-time face recognition using a mobile-cloudlet-cloud acceleration architecture, in: Proc. IEEE ISCC, 2012, pp. 59–66.

Ayman Younis received the B.Eng. and M.Sc. degrees in Electrical Engineering from the U. of Basrah, Iraq, in 2008 and 2011, respectively. He is pursuing the Ph.D. degree in ECE at Rutgers, NJ, USA, under the guidance of Dr. Pompili. His research focuses on wireless communications and mobile cloud computing, with emphasis on software-defined testbeds. He received the Best Paper Award at the IEEE/IFIP Wireless On-demand Network Systems and Services Conference (WONS) in 2021.

Brian Qiu obtained a M.S. in Electrical and Computer Engineering (ECE) at Rutgers University, NJ, in 2021, where he also received his B.S. in 2019. He is interested in distributed mobile computing and, in general, in networks with a focus on anonymity and privacy.

Dario Pompili is an associate professor with the Dept. of ECE at Rutgers University. Since joining Rutgers in 2007, he has been the director of the CPS Lab, which focuses on mobile edge computing, wireless communications and networking, acoustic communications, and sensor networks. He received his Ph.D. in ECE from the Georgia Institute of Technology in 2007. He had previously received his 'Laurea' (combined BS and MS) and Doctorate degrees in Telecommunications and System Engineering from the U. of Rome ''La Sapienza,'' Italy, in 2001 and 2004, respectively. He has received a number of awards in his career including the NSF CAREER'11, ONR Young Investigator Program'12, and DARPA Young Faculty'12 awards. In 2015, he was nominated Rutgers-New Brunswick Chancellor's Scholar. He served on many international conference committees, taking on various leading roles. He published about 200 refereed scholarly publications, some of which received best paper awards: with more than 13K citations, Dr. Pompili has an h-index of 44 and an i10-index of 111 (Google Scholar, Oct'21). He is a Fellow of the IEEE Communications Society (2021) and a Distinguished Member of the ACM (2019). He is currently serving as Associate Editor for IEEE Transactions on Mobile Computing (TMC).