A survey on deploying mobile deep learning applications: a systemic and technical perspective
Yingchun Wang, Jingyi Wang, Weizhan Zhang, Yufeng Zhan, Song Guo, Qinghua
Zheng, Xuanyu Wang
PII: S2352-8648(21)00029-8
DOI: https://doi.org/10.1016/j.dcan.2021.06.001
Reference: DCAN 283
Please cite this article as: Y. Wang, J. Wang, W. Zhang, Y. Zhan, S. Guo, Q. Zheng, X. Wang, A
survey on deploying mobile deep learning applications: a systemic and technical perspective, Digital
Communications and Networks, https://doi.org/10.1016/j.dcan.2021.06.001.
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition
of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of
record. This version will undergo additional copyediting, typesetting and review before it is published
in its final form, but we are providing this version to give early visibility of the article. Please note that,
during the production process, errors may be discovered which could affect the content, and all legal
disclaimers that apply to the journal pertain.
© 2021 Chongqing University of Posts and Telecommunications. Production and hosting by Elsevier
B.V. on behalf of KeAi Communications Co. Ltd.
Yingchun Wang a, Jingyi Wang a, Weizhan Zhang ∗a, Yufeng Zhan b, Song Guo b, Qinghua Zheng a, Xuanyu Wang a
a MOEKLINNS Lab, School of Computer Science and Technology, Xi’an Jiaotong University, Shaanxi 710049, China
b Department of Computing, The Hong Kong Polytechnic University, HK 999077, China
Abstract
With the rapid development of mobile devices and deep learning, mobile smart applications using deep learning technology have sprung up. Such applications satisfy multiple needs of users, network operators and service providers, and have rapidly become a main research focus. In recent years, deep learning has achieved tremendous success in image processing, natural language processing, language analysis and other research fields. Although task performance has been greatly improved, the resources required to run these models have increased significantly. This poses a major challenge for deploying such applications on resource-restricted mobile devices. Mobile intelligence needs faster mobile processors, more storage space, smaller but more accurate models, and even the assistance of other network nodes. To help readers concisely establish a global view of the entire research direction, we classify the latest works in this field into two categories, local optimization on mobile devices and distributed optimization, based on the computational position of machine learning tasks. We also list a few typical scenarios to make readers realize the importance and indispensability of mobile deep learning applications. Finally, we conjecture what the future may hold for research on deploying deep learning applications on mobile devices, which may help to stimulate new ideas.
Keywords:
Deep learning, mobile computing, distributed offloading, distributed caching
∗ Weizhan Zhang (Corresponding author) (email: [email protected]).

1. Introduction

The development of smart phones, laptops and other new mobile devices has further promoted the development of AI applications on mobile devices.

In this paper, we define a deep model that has been trained, applied to a specific service, and designed to run on mobile devices as a mobile deep learning application (MDLA). Its training may be cloud-based, or it may be based on edge devices using federated learning technology, which is not our focus. The focus of our investigation is the reasoning (inference) process of these mobile deep learning models. Cameras, microphones, and sensors can obtain various types of information, such as video, audio, and acceleration, from the real world. These kinds of data are then provided to MDLAs. Based on this, MDLAs have developed rapidly and attracted widespread attention due to their tangible benefits for users from all sides. For example, MDLAs benefit users by performing malicious software detection [1], app recommendation [2], user verification [3, 4], mobile visual tasks [5, 6], mobile web browsing optimization [7], human activity monitoring [8], medical health monitoring [9, 10] and other smart fields [11–17].

For network operators and third-party service providers, the deployment of MDLAs can support mobile crowdsourcing scenarios [18–21], distributed machine learning [22–24], federated learning [25], multiple smart IoT applications [26], and other services using mobile big data [27, 28]. The intelligence in mobile applications is changing the way people live, work and interact with the world.

Although these intelligent applications are wonderful, they require a lot of storage, calculation, and high consumption of power and network bandwidth, and users may be reluctant to download them to their mobile devices. These requirements, which contradict the limited resources of mobile devices, become a main bottleneck for the development of MDLAs. For example, the CNN, serving as the main method in mobile vision tasks, is executed for each input as a cascade of layers, mainly including convolution layers, pooling layers and fully connected (FC) layers; it produces intermediate results called feature maps and outputs inference results. Such CNN executions are known for their high time and space complexity. A typical CNN model, such as AlexNet, occupies more than 200 MB of memory, has 60 million parameters and needs 720 million FLOPs. Another, VGG-16, occupies more than 500 MB of memory, has 138 million parameters and needs 15,300 million FLOPs. Even though these smart mobile vision applications are wonderful, they need so much storage, calculation, and high consumption of power and network bandwidth that users may be reluctant to download them to their mobile devices.

It is very important to study the deployment of MDLAs on mobile devices, as it makes it possible to migrate a large number of centralized applications to the mobile end. Some of them have changed our daily life. Users may have had to manually record their meal information in the past, but now this can be achieved with a smart spoon. Besides, there are many other important public scenes. For example, by carefully designing the offloading strategy, Lu et al. make it possible to deploy a CNN on the user's mobile phone to detect important objectives, such as criminals, by crowd sensing. Without MDLAs, the intelligence of the machine will stay in the centralized cloud and never appear in front of people on the edge. So our work is dedicated to investigating some classic and state-of-the-art research on deploying MDLAs.

Recently, two directions have been explored to solve this problem. The first addresses the problem from the perspective of the software and hardware of local mobile devices; that is, the goal is to run MDLAs locally on mobile devices without the help of a third party. The key method is to reduce the resource requirements of running MDLAs, and there are some popular solutions. One approach is to compress the deep learning model: even though this might influence its accuracy, it decreases the demand for computation and storage resources, and an essential problem is how to balance the two counterparts [29–35]. Another possible solution is to reduce computational needs by reusing intermediate computing results [6, 29, 36, 37] or maximizing the rate of utilizing device resources through precise dispatching among multiple deep learning tasks [5, 38, 39]. Moreover, much work has been done to develop deep learning frameworks suitable for mobile devices [40–46]. Finally, improving the hardware of mobile devices to support the operation of MDLAs can be taken into consideration.

Another research direction is to gain support from background servers with sufficient resources. Here, 'sufficient resources' means that the current resources of background servers are sufficient for the running of tasks to be offloaded and that the …
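The AlexNet and VGG-16 storage figures quoted earlier follow directly from the parameter counts: with 32-bit (4-byte) floating-point weights, memory is roughly parameters × 4 bytes. A quick sanity check (the parameter counts are the ones quoted in the text; the script itself is only an illustration):

```python
# Rough memory footprint of CNN weights stored as 32-bit floats.
# Parameter counts are those quoted in the introduction above.
models = {
    "AlexNet": 60_000_000,    # ~60 million parameters
    "VGG-16": 138_000_000,    # ~138 million parameters
}

BYTES_PER_FP32 = 4

for name, params in models.items():
    mb = params * BYTES_PER_FP32 / (1024 ** 2)
    print(f"{name}: ~{mb:.0f} MB of weights")
```

This yields roughly 229 MB for AlexNet and 526 MB for VGG-16, consistent with the "more than 200 MB" and "more than 500 MB" figures in the text.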
… [50]. Second, in 2014, ETSI and IBM jointly established the Mobile Edge Computing (MEC) standardization group, and formally proposed the concept of standardized MEC [51]. MEC offers IT and cloud computing in a Radio Access Network (RAN); its location is closer to mobile users, which further reduces transfer latency. However, due to its limited resources compared with cloud resources, problems such as multi-user …

… communication in recent years enables mobile devices to transfer computing to other nearby computing devices, such as smart phones, home-use computers, etc. [57]. Even though such a strategy minimizes the distance between task-initiating devices and the computing location, considering the complicated mobility of users and the resource availability of the cloudlet, the offloading strategy needs to be delicately designed within its limited coverage area. In addition, the caching of services and data on background servers and communication optimization between mobile devices and backgrounds are also key research areas, but these are not the focus of this paper.

Fig. 1 shows an overview of deploying MDLAs. A mobile user can execute local computation or choose distributed deployment (by offloading). The up arrow indicates the offloading process. Computing tasks from mobile devices are first offloaded to the edge server. If the edge server cannot satisfy the resource needs, tasks can be further offloaded to the remote cloud with sufficient resources. We can also use D2D communication to transfer some relatively light computing to devices … of ad-hoc cloudlets due to the closer distance.

Some investigations have been carried out in related fields. Mao's work [58] mainly discusses mobile edge computing from the perspectives of communication technology and computing resources, and studies the offloading of mobile computing to edge servers through computation … cases and service environments, followed by a detailed illustration of the standardized MEC infrastructure. Finally, this work discusses three key areas of MEC computation offloading: offloading strategy, edge server resource allocation and user mobility management. Dao's work [60] discusses how to deploy multimedia applications using deep learning on mobile devices; it focuses on the local computation of deep multimedia applications on mobile devices and discusses them from the two aspects of software and hardware. Kumar's work [61] focuses on the distributed deployment of mobile computing tasks, surveying it from the perspective of computation offloading. However, from these works we can learn about the running of MDLAs only in certain respects: offloading computation to certain places such as edge servers, studying distributed deployment only through offloading techniques, or discussing only a special kind of MDLA, such as deep multimedia applications. None of these works studied MDLAs in a comprehensive way.

This paper studies MDLAs from the perspectives of systematization and networking, and conducts a comprehensive survey on the challenges and corresponding state-of-the-art solutions from two directions: locally optimized running and distributed deployment. The rest of this paper is organized as follows: Section 2 introduces the deployment of MDLAs on mobile devices; Section 3 illustrates the distributed deployment of MDLAs; Section 4 classifies MDLAs according to different beneficiaries; Section 5 lists possible future research directions in related fields; and Section 6 summarizes this article.

2. Deployment on mobile devices

In this section, we discuss how to run MDLAs locally on mobile devices. The main idea is to reduce the resource requirements for running deep learning tasks or to design deep learning frameworks suitable for mobile devices to optimize an MDLA's computation. To reduce resource usage, we can make the following considerations: firstly, the natural idea is to reduce the amount of calculation and storage space required by the model. …

2.1. Deep model compression

… model compression, and this paper summarizes the methods based on space-memory and time-computation.

2.1.1. Reducing model spatial complexity

The spatial complexity of a deep neural network is determined by the number and size of parameters in the deep model. By reducing the spatial complexity of the deep model, its memory footprint can be greatly reduced. Based on this, we can classify related works into two categories: cutting down the number of model parameters, including pruning and sharing, and reducing their size, as in weight quantization. The related specific researches are as follows.

a) Pruning

Pruning aims to reduce the number of model parameters. The basic idea is to select and delete some trivial parameters that have little influence on the model's accuracy, and then retrain the model to recover its performance.

Nonstructural pruning removes trivial neurons …
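The select-delete-retrain loop described above can be sketched in a few lines. This is a generic magnitude-based (nonstructural) pruning pass, not the algorithm of any specific cited work; the `evaluate` callback is a stand-in for real retraining and accuracy measurement:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    # Threshold = k-th smallest magnitude; everything at or below it is deleted.
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def iterative_prune(weights, evaluate, min_accuracy, step=0.25):
    """Increase sparsity step by step; stop once the pruned model fails to
    reach the minimum accuracy set by the user (retraining would go here)."""
    best = weights
    sparsity = step
    while sparsity < 1.0:
        candidate = magnitude_prune(weights, sparsity)
        if evaluate(candidate) < min_accuracy:
            break
        best = candidate
        sparsity += step
    return best
```

In practice each pruning step would be followed by fine-tuning before evaluation; the sketch only shows the control flow of the iteration.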
… iteration ends when the pruned model fails to reach the minimum accuracy set by the users.

Middle hidden-layer pruning directly deletes some layers in networks. Rueda et al. [32] propose a method that maximizes units to combine neurons into more complex convex functions and selects neurons based on the local relevance of each neuron's activation on the training set. Li et al. [62] divide the network layers into weight layers, such as convolutional layers or fully connected layers, and non-weight layers, such as pooling layers or activation layers. A non-weight layer incurs less theoretical computation but a longer computation time due to memory data access time and other reasons. The authors combine a non-weight layer with a weight layer and eliminate independent non-weight layers, which significantly reduces the inference time.

b) Parameter sharing

There are weighted edges between layers, and the number of weight parameters increases with a rise in the number of nodes. Therefore, to reduce parameters, approaches are designed to share the weights of certain edges. A simple example is that if every layer has 1000 nodes, then …

For example, the main idea of Han's work [63] is to compute multiple clustering centers of the weights using k-means clustering. This approach clusters weights to the nearest centroids and compensates for the weights by fine-tuning training. Chen hashes parameters into corresponding hash buckets, and parameters in the same bucket share a single value [33].

c) Network quantization

Network quantization compresses initial networks by reducing the number of bits for each weight parameter. Quantization uses low-precision data to represent the original high-precision data. The traditional technique is to convert 32-bit floating-point to 16-bit floating-point, 16-bit fixed-point, 8-bit fixed-point and so on. There are some related technologies, such as Binarized Neural Networks, Ternary Weight Networks, and the XNOR Network. In Li's work [34], he proposes a higher-order binarization scheme and executes binarization both on the input layer and the weight layer of the CNN. Specifically, this method recursively performs residual quantization to improve the effect of binary approximation. Yin et al. accelerate the operation of DNNs on low-power computing hardware by understanding the Straight-Through Estimator in training activation-quantized neural nets [64].

d) Direct design of small models

Some works expect to yield small and accurate deep models by more delicate model design or direct training of promising network architectures rather than iterative execution of training and pruning.

For instance, the authors of [65] propose a small CNN architecture called SqueezeNet and put forward a network architecture similar to Inception, called a "Fire Module", which consists of a squeeze convolutional layer with 1 × 1 filters and an expanded convolutional layer with both 1 × 1 and 3 × 3 filters. It reduces the computation overhead of convolutional layers by reducing the size of filter kernels and the number of channels.

… depthwise convolution layers.

Lin et al. [35] attempt to overthrow the previous idea of network pruning. Their goal is not to prune after training a large model, but to design a small model architecture and train it from the beginning. Through a large number of assessments of state-of-the-art pruning algorithms, they found that it is the network structure that counts, rather than which weights are retained.

2.1.2. Lowering time complexity

Time compression of models means lowering the inference time complexity of deep models with no notable increase in the training time complexity. By reducing the time complexity of the deep model, we may be able to reduce the amount of battery power it requires. Common methods that reduce the time complexity are as follows.
a) Low-rank decomposition

… low-rank tensor decomposition to eliminate redundancy in the convolution kernels. Their method obtains the global optimizer of the decomposition, and based on this, they use a new method to train low-rank constrained CNNs from scratch [67].

b) Computation acceleration

Computation acceleration usually speeds up forward propagation by improving matrix multiplication, abating numerical resolution, etc. Park's work proposes a kind of tensor-based multiplication to achieve efficient computation between a dense matrix and a sparse matrix. It pre-locates the values that will be zero to avoid calculating them when multiplying matrices [68].

c) Knowledge distillation

Knowledge distillation was first proposed by Bulica in 2006 [69], and Hinton summarized and developed it in 2014 [70]. The main idea of knowledge distillation is to train a small network model to imitate the knowledge of a large network trained in advance, similar to the relationship between teacher and student. The large network is the "teacher", and the small network is the "student". The general method is to learn the softmax classification outputs of teacher models. For example, Hinton et al. reduce the amount of computation in a deep learning network by following a less cumbersome model. To transfer the generalization ability of the cumbersome model to a small model, they use the class probabilities produced by the cumbersome model as "soft targets" for training the small model and add a small term to the objective function to encourage the small model to predict the true targets.

d) Data transformation

… each of them has different trade-offs between model size and accuracy. Manually balancing these trade-offs and designing deep models for each of them is very difficult because there are so many factors to consider. In the most recent work [72], the authors propose an automated Mobile Neural Architecture Search (MNAS) approach using deep reinforcement learning to search for a model for a specific trade-off. They also propose a novel factorized hierarchical search space to obtain a good trade-off between flexibility and search space size. Using deep learning and automatic searching avoids the complexity of manual design and is a promising new direction to build mobile deep models. For these lightweight convolutional neural networks (such as MobileNets) designed for … Conv) are their key operations. Taking into account the characteristics of current mobile hardware and software systems, Zhang et al. proposed techniques to re-optimize the implementations of these operations based on the ARM architecture [73].

2.2. Reuse of intermediate results

In addition to modifying the structure of deep learning models, it is possible to reuse the intermediate computing results of deep models by caching part of the computing results based on the partial similarity of the data. As a result, this can decrease the model's computational resource needs. The intermediate results to be reused can be considered at multiple granularities, including the middle layers' computational results for the same model and a similar input, a shared similarity search for different models and the same input, similar semantic computational results for the
same models on different devices and a similar input, etc. The basic idea of these techniques is to reduce the computational resources of running MDLAs.

2.2.1. Data reuse among image frames

Data reuse among image frames is often applied to the same model and similar inputs, typically in continuous mobile vision: a serial mobile video image stream obtained from omnipresent cameras on mobile and wearable devices supports diverse vision MDLAs, including recognition assistance, lifestyle monitors, street navigation, etc. The CNN is a state-of-the-art vision processing method, which can be regarded as a group of stacked layers. Each input frame generates intermediate results called feature maps and outputs reasoning results. Because of these characteristics, we can reuse its layer processing results for similar continuous images.

… that is, similarity between frames close in time. Referencing the heuristic approach in video compression, DeepCache propagates areas of reusable results in frames using the model's inner structure … successive frames in the first-person video. It computes the current frame by reusing the middle results of the previous frame through the inner structure of the convolution layer rather than simply reusing the final output.

2.2.2. Data reuse among multiple deep tasks

Data reuse among tasks is often implemented for different deep models and the same input. MDLAs might have several models for different but related tasks with the same input executed during a similar time, and it is a waste of resources to repeat the underlying feature extraction calculation. The popular idea is to share partial computing results of models across multiple tasks. For example, an MDLA may aim to infer the race, age and gender of the user. One choice is to train a DNN for each task, which incurs the cost of one inference per task. DNNs can be treated as bottom layers extracting the feature representation and high layers aiming for different tasks. The abstract features computed at the bottom could be used by multiple high layers. Hence, the main idea of data reuse among multiple deep tasks is that feature representations computed by lower layers could be shared among different high layers to save computational costs.

The authors of MCDNN apply the idea above and present "class" clustering, a DNN classifier specialized for similar tasks. They aim to dominate context and provide similar classes with specialized, light models. More importantly, the model needs to perform well in recognition when the input does not belong to any of the classes, classifying it as the "other" class. The specialized model is concatenated with a generic model, and only if the specialized model reports the input as the "other" class does the general model perform further classification [36].

2.2.3. Data reuse among multiple devices

… similar inputs. There are many scenarios of running the same MDLA on adjacent multiple devices, and these application cases often process similar … designed Adaptive Locality-Sensitive Hashing (A-LSH) and Homogenized k Nearest Neighbors (H-kNN). The former achieves extensible and constant lookup, while the latter provides high-quality reuse and a tunable accuracy guarantee [37].

2.3. Resource dispatch among multiple tasks

There is usually more than one MDLA running on a mobile system. Joint resource dispatch optimization among all these deep models, instead of independent optimization for a single deep task, can maximize the overall performance of all MDLAs. For instance, the architecture of DeepEye removes a crucial limitation of executing multiple deep learning models on resource-limited mobile devices by presenting a novel inference software pipeline. Its goal is to combine the execution of computation-heavy convolution layers with high-memory-cost fully connected layers, which
realizes partial execution of multiple models simultaneously, especially for CNNs [38].

NestDNN considers that the resources available on mobile devices are dynamic due to events such as starting new applications, closing current applications or modifying application priorities. Based on this consideration, the approach presents a multitask resource-aware deep learning framework for mobile vision systems. It dynamically selects the optimal resource-accuracy balance for each deep learning model so that the models' resource needs are compatible with the available resources in the runtime system [5].

Geng et al. solve an energy-saving local core-offloading problem for multiple deep learning tasks running on multicore mobile devices. They formalize the problem as a mixed-integer nonlinear programming problem and then propose a heuristic algorithm to jointly solve the offloading decision and task scheduling problems. This strategy prioritizes tasks of various applications to satisfy both application time constraints and task-dependency requirements. To reduce the search cost, they recursively inspect tasks and move them to the right CPU cores to minimize the energy cost [39].

2.4. Deep learning frameworks for mobile devices

There are many deep learning frameworks for resource-sufficient computation platforms, such as Caffe and TensorFlow, which mainly optimize the training process while bringing little profit in the inference phase. Recently, some work has focused on developing professional software libraries to train and deploy deep models on resource-limited mobile devices. The general idea is to combine traditional frameworks and accelerate the inference process of trained networks to significantly reduce the resources needed for running MDLAs.

For example, DeepLearningKit supports using CNNs on mobile devices possessing a GPU under an iOS system. It can speed up the inference phases of deep models trained by Caffe [74]. DeepX, as a software accelerator, also optimizes the computation, memory, and energy cost of the inference phases of previously trained deep networks. Its method is to divide the network's computation into simpler pieces that can be arranged more efficiently. Each piece can be run on a different processor (e.g., GPU, CPU) to achieve sufficient utilization of the computation ability of mobile devices [75].

In Table 1, we list some current state-of-the-art deep learning frameworks on mobile devices.

… MDLAs and presents four ideas: reducing deep model complexity, reusing intermediate computing results, performing resource dispatch among multiple tasks, and developing deep learning frameworks for mobile devices.
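The decomposition idea behind accelerators like DeepX, splitting a network's computation into pieces and placing each piece on the processor where it finishes soonest, can be illustrated with a toy greedy scheduler. Everything here (the per-layer costs, processor speeds, and the greedy rule) is a simplified assumption for illustration, not DeepX's actual algorithm, and it ignores data-transfer costs and layer dependencies:

```python
def assign_layers(layer_flops, processor_speed):
    """Greedily place each piece of work on the processor that would finish
    it earliest, given per-processor speeds in FLOPs per second.
    Returns the assignment list and each processor's busy time."""
    busy_until = {p: 0.0 for p in processor_speed}
    assignment = []
    for flops in layer_flops:
        # Pick the processor with the earliest projected finish time.
        p = min(processor_speed,
                key=lambda q: busy_until[q] + flops / processor_speed[q])
        busy_until[p] += flops / processor_speed[p]
        assignment.append(p)
    return assignment, busy_until

# Hypothetical workload: per-piece cost in FLOPs, two processors (made-up numbers).
layers = [8e9, 1e9, 4e9, 0.5e9]
speeds = {"gpu": 4e9, "cpu": 1e9}   # FLOPs per second
plan, finish = assign_layers(layers, speeds)
```

With these numbers the heavy pieces land on the GPU and the light ones fill idle CPU time, which is the kind of utilization the text describes.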
Table 1: An overview of the popular deep learning frameworks for mobile terminals

Name | Company | Android | iOS | CPU | GPU | DSP | Year | Open source | Characteristics | On-device training
TensorFlow Lite [40] | Google | ✓ | ✓ | ✓ | ✓ | × | 2017 | ✓ | Lightweight, cross-platform, fast | ×
Caffe2 [41] | Facebook | ✓ | ✓ | ✓ | ✓ | × | 2017 | ✓ | Lightweight, modular, scalable | ✓
Core ML 2 [42] | Apple | × | ✓ | ✓ | ✓ | × | 2018 | × | Weight quantification, batch forecasting | ×
NCNN [43] | Tencent | ✓ | ✓ | ✓ | × | × | 2017 | ✓ | Cross-platform, no third-party dependence, compiler-level optimization, extremely fast, scalable | ×
FeatherCNN [44] | Tencent | ✓ | ✓ | ✓ | × | × | | ✓ | Lightweight (hundreds of KB), no third-party dependency, scalable | ✓
SNPE [45] | Qualcomm | ✓ | × | ✓ | ✓ | ✓ | | | Executes any depth model, greatly limited by hardware | ×
… | | | | | | | | | Compiler-level optimization, supports various framework model transformations |
Paddle [76] | Baidu | ✓ | ✓ | ✓ | ✓ | ✓ | 2017 | ✓ | Multi-hardware platform, deep model quantization and compression | ×
… | … the filters as well as to prune the filters iteratively
[32] | Output units are maximized and multiple neurons are merged into more complex convex function representations
[33] | Parameters are mapped to the corresponding hash bucket, where parameters in the same bucket have the same value
[34] | The weights and inputs of CNN network layers are binarized so that the computing speed is faster and the memory consumption is smaller
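The binarization entry ([34]) above can be made concrete with the standard trick used by binary-weight networks: approximate a real-valued weight tensor W by α·sign(W), where α is the mean absolute value, and optionally binarize the residual again for a higher-order approximation. This is a generic sketch of that family of schemes, not the exact method of [34]:

```python
import numpy as np

def binarize(weights):
    """Approximate `weights` as alpha * sign(weights), where alpha is the
    mean absolute value (the L2-optimal scaling factor for a sign tensor)."""
    alpha = np.mean(np.abs(weights))
    signs = np.where(weights >= 0, 1.0, -1.0)
    return alpha, signs

def residual_binarize(weights, order=2):
    """Recursively binarize the residual, as higher-order schemes do:
    W ≈ a1*B1 + a2*B2 + ...  Each extra term refines the approximation."""
    residual = weights.astype(float)
    terms = []
    for _ in range(order):
        alpha, signs = binarize(residual)
        terms.append((alpha, signs))
        residual = residual - alpha * signs
    return terms
```

Each sign tensor needs only one bit per weight plus one scaling factor, which is where the memory and speed savings quoted in the table come from.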
… image analysis and a great deal of image rendering to powerful cloud servers [50]. This kind of AR MDLA is computationally intensive, has a high power cost and is a typical application type to be offloaded to clouds. The authors use Pokemon Go, a popular AR game, to test their platform.

However, the mobile cloud still faces multiple challenges. Obviously, because of its relatively long distance from users, mobile cloud computation is not suitable for all offloading cases, especially for interaction-intensive applications. MCC also places massive additional loads on the radio and backhaul of mobile networks. Dinh et al. list the technical challenges of MCC in detail [81]. In mobile communication, because of the characteristics of wireless networks, e.g., lack of wireless resources, traffic congestion and multiple rates, MCC faces challenges including low bandwidth, service availability and heterogeneity. Regarding the computational aspect, there are difficulties such as efficient and dynamic computation offloading under variable conditions, user and data security problems, productivity of data access, and contextual awareness. These are not the focus of this article.

3.1.2. Edge network

The long distance from users and restricted backhaul bandwidth have made it difficult for MCC to cope well with all offloading scenarios in recent years, especially for latency-sensitive and interaction-intensive MDLAs. Over time, people have continued to explore how to move resources and services closer to users to reduce access delays and energy consumption. Then, mobile edge computing, ad hoc clouds, and other novel computational architectures emerged. Mobile Edge Computing (MEC) is a promising solution to compensate for the MCC problem. It is closer to mobile users, providing IT and cloud computing in wireless access networks [51]. As one of the key techniques of the 5G era, MEC has some advantages, including low latency, high bandwidth, real-time wireless network information, and location awareness. In recent years, there have been many works studying the offloading of mobile computation to MEC servers. For instance, Liu et al. study the offloading of computation-intensive Mobile AR (MAR) tasks to servers in edge networks. They design an edge network orchestrator and build a measurement-based analytical model to express the trade-off between latency and accuracy. They also propose a corresponding algorithm as the core component of the orchestrator to perform server assignment and frame resolution. In their edge-based MAR system, mobile tasks are first offloaded to the orchestrator and then sent to the edge server [56].

In recent years, ad hoc mobile cloudlets [57] have emerged as a closer offloading point for mobile users, who can offload their computation to peer devices ad hoc to save local energy and resources. When we use cloudlets, there are more communication methods than in MEC and MCC, such as Bluetooth, Wi-Fi Direct, and other direct communication techniques, which may lead to lower latency. Van et al. discuss the optimal offload decision for mobile users in ad hoc mobile clouds. Mobile users can offload computation to nearby cloudlet devices through Device-to-Device (D2D) communication-powered cellular mobile networks. The authors developed an offload scheme based on Deep Reinforcement Learning (DRL), aiming to make an optimal offloading policy by considering the uncertainty of users and cloudlets as well as the resource availability of cloudlets [82].

3.1.3. Collaboration of remote cloud and edge networks

MCC offers high-capacity service and rich computing resources, but it is too far from mobile terminals and faces high latency. MEC is closer to mobile users and offers low latency, but it has limited computing resources and low-capacity queries. Ad hoc mobile cloudlets are the closest to mobile users and offer more communication methods and low latency. Users can access cloudlets through one hop in the wireless network, but resources on cloudlets are extremely limited and offer few services. Therefore, when multiple mobile users offload their tasks to cloudlets, resources on cloudlets are likely to be depleted, and the rejection rate of new requests is high. Each type of MDLA has different characteristics, such as latency sensitivity and computation intensity. Therefore, combining MCC, MEC and ad hoc mobile cloudlets, learning from each one's strong points and closing the gaps, would be a promising approach for MDLAs.

For example, Teerapittayanon proposes a Distributed Deep Neural Network (DDNN) based on distributed computing hierarchies, including clouds, edge networks and terminal devices. The DDNN can accommodate DNN inference in the cloud, as well as rapid and local inference at edge servers and terminal devices using shallow parts of the DNN. With the support of extensively distributed computing hierarchies, a DDNN can extend neural networks in scale and geographic reach [83]. Its distributed method also improves the sensor fusion ability, fault tolerance and data confidentiality of DNNs, and it can be applied to large MDLAs because of its more robust and safer operation.

In [84], Vitor et al. provide a new strategy to simplify the combination of cloud and fog facilities in IoT scenarios, called the Combined Fog-Cloud (CFC). It introduces a QoS-aware service allo…

Josilo et al. study the resource allocation problem of multiple self-benefiting mobile users offloading to mobile clouds [52]. They define this as a non-cooperative game problem and present an efficient decentralized algorithm to jointly optimize their offloading strategies. Their algorithm converges to a pure-strategy Nash equilibrium. Finally, an upper bound for the price of anarchy in the game is provided for the two cloud resource models they propose.
cation problem and expresses it as an integer opti- Liu et al. introduce a joint multi-resource alloca-
mization problem to satisfy capacity requirements tion framework located in a cloud computation sys-
as well as minimize latency. tem based on the Semi-Markov Decision Process
Lin et al. coordinate the computational re- (SMDP). The goal of the framework is to maxi-
of
sources of cloudlets and remote clouds to fully uti- mize the overall advantages of the entire system by
lize these two systems. Additionally, they develop constructing an optimal strategy of wireless band-
ro
a system reward model for wireless bandwidth and width and computing resource allocation for mul-
computational resource allocation. They formu- tiple mobile users in MCC that has a low service-
late the problem as a semi-Markov decision pro-
cess and use the LP solver tool to solve it as a linear
programming problem[53].
-p
denial rate and processing latency[53].
Zheng et al. adopt a multi-user stochastic game-
theoretic approach in an MCC dynamic offloading
re
environment. Mobile users are in a dynamic state,
3.2. Computation mode active or inactive, and radio channels vary stochas-
lP
strategy, that is, running MDLAs locally, perform- tential strategy with at least one Nash Equilibrium
ing complete offloading or dividing MDLAs into (NE). They propose a multiagent stochastic learn-
several independent subtasks to partially offload. ing algorithm with convergent speed to solve the
ur
idation of calculation results. Keshtkarjahromi et get is to minimize the amount of centralized of-
al. propose a Coding Cooperative Computing Pro- floading to the cloud caused by a lack of service
tocol (C3P). It dynamically offloads encoded sub- caching or insufficient resources at the edge. Then,
tasks of MDLAs to multiple computable locations they propose a bicriteria algorithm that provably
in the edge network and can adapt to time-varying achieves approximation guarantees while violating
edge resources [85]. resource constraints in a bounded way [87].
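The game-theoretic decentralized schemes above (e.g., Josilo et al. and Zheng et al.) can be illustrated with best-response dynamics on a simple congestion game. This is a minimal sketch, not the surveyed algorithms: each user chooses between local execution and offloading over a shared channel, and all constants (CPU speeds, bandwidth, task sizes) are hypothetical.

```python
# Illustrative sketch: decentralized best-response dynamics for a simple
# multi-user offloading game. The offloading cost grows with the number of
# users sharing the wireless channel, so repeated best responses converge
# to a pure-strategy Nash equilibrium. All numbers are hypothetical.

def local_cost(user):
    return user["cycles"] / user["cpu_hz"]            # seconds on device

def offload_cost(user, n_offloaders, bandwidth=10e6, server_hz=10e9):
    share = bandwidth / max(n_offloaders, 1)          # equal channel split
    return user["data_bits"] / share + user["cycles"] / server_hz

def best_response_dynamics(users, max_rounds=100):
    """Iterate until no user wants to change its decision (a pure NE)."""
    offload = [False] * len(users)
    for _ in range(max_rounds):
        changed = False
        for i, u in enumerate(users):
            n = sum(offload) - offload[i] + 1         # offloaders if i offloads
            best = offload_cost(u, n) < local_cost(u)
            if best != offload[i]:
                offload[i], changed = best, True
        if not changed:
            break
    return offload

users = [{"cycles": 2e9, "cpu_hz": 1e9, "data_bits": 8e5},   # compute-heavy task
         {"cycles": 1e7, "cpu_hz": 2e9, "data_bits": 8e7}]   # data-heavy task
decisions = best_response_dynamics(users)
```

In this toy instance the compute-heavy task offloads while the data-heavy task stays local, mirroring the intuition that transmission cost can dominate the decision.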
3.2.4. Multi-server multi-user
Under multi-server multi-user conditions, there are more problems to be noted. We define them as follows:

Resource allocation on a single server: Resources on edge servers are limited and may fail to satisfy all requests from the covered area. We define the allocation of limited server resources among multiple users as a resource allocation problem.

Users' request routing across multiple servers: The density of base stations in the 5G era will reach 50 BSs/km2 [86], which will lead to users being located in complex multi-base-station environments with overlapping coverage areas. Such complicated, multi-unit situations make it difficult for users to decide where to offload their MDLA tasks to achieve optimal performance, and we define this as a request routing problem.

Placement of services: Services can be cached on edge servers close to users to provide lower service access latency. There are three prerequisites: (i) edge servers can only cache a limited number of services; (ii) users' requests can only be routed to servers holding the service they request; and (iii) users are in a complex multi-unit environment. We must decide how to allocate various services among multiple edge servers to respond to more requests and maximize overall performance. We define this as a service placement problem and discuss its solution in subsection C.

Balancing offloading among edge servers: The distribution of mobile users presents high spatial variety and mobility. These characteristics cause an imbalance in the workload among edge servers and influence the request-response time. We can define two problems from this: a load-balancing problem and a resource-sharing problem.

With these definitions, we can now provide some examples of offloading under multi-server multi-user modes. Each example faces one or more of the problems above.

Poularakis et al. formulate the Joint Service Placement and Request Routing problem (JSPRR) in a multi-unit MEC network. Its optimization target is to minimize the amount of centralized offloading to the cloud caused by a lack of service caching or insufficient resources at the edge. They then propose a bicriteria algorithm that provably achieves approximation guarantees while violating resource constraints in a bounded way [87].

Liu et al. primarily focus on users' request-routing problems. They study dynamic allocation of users' offloading requests among multiple edge servers in a MAR system [56].

Chen's work focuses on resource-allocation and request-routing problems. It studies offloading in a super-dense computing network based on the idea of software-defined networking. The authors formulate the offloading strategy as an NP-hard mixed-integer nonlinear programming problem and further divide it into two subproblems: the convex subproblem of resource allocation and the 0-1 programming subproblem of request routing. They use alternating optimization techniques to find a solution [88].

The work of [89] mainly focuses on the resource-allocation problem and resource sharing among multiple edge computing servers. It models these problems as a multi-objective optimization problem and constructs a framework based on Cooperative Game Theory (CGT) in which every edge server first satisfies its own offloading requests and then shares the remaining resources with other servers. The authors present an O(N) algorithm and achieve a Pareto-optimal allocation.
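The joint placement-and-routing problems above (e.g., JSPRR) can be illustrated with a small greedy sketch. This is not the bicriteria algorithm of [87]; it only shows the structure of the decision. Server capacities, the demand list and the reachability map are hypothetical.

```python
# Minimal greedy sketch of joint service placement and request routing:
# place the most-demanded services on capacity-limited edge servers, then
# route each request to a reachable server hosting that service; anything
# left over is centrally offloaded to the cloud. All inputs are hypothetical.

from collections import Counter

def place_and_route(requests, reachable, capacity):
    """requests: list of (user, service); reachable: user -> set of servers;
    capacity: server -> number of service slots.
    Returns (placement, n_cloud), where n_cloud counts requests that must
    still be offloaded to the central cloud."""
    demand = Counter(svc for _, svc in requests)
    placement = {s: set() for s in capacity}
    # Greedy placement: most-demanded services first, on the least-loaded
    # server that still has a free slot.
    for svc, _ in demand.most_common():
        for server in sorted(capacity, key=lambda s: len(placement[s])):
            if len(placement[server]) < capacity[server]:
                placement[server].add(svc)
                break
    # Routing: a request is served at the edge only by a reachable server
    # holding the requested service; otherwise it goes to the cloud.
    n_cloud = 0
    for user, svc in requests:
        if not any(svc in placement[s] for s in reachable[user]):
            n_cloud += 1
    return placement, n_cloud

requests = [("u1", "detect"), ("u2", "detect"), ("u2", "translate"),
            ("u3", "asr")]
reachable = {"u1": {"bs1"}, "u2": {"bs1", "bs2"}, "u3": {"bs2"}}
capacity = {"bs1": 1, "bs2": 1}
placement, n_cloud = place_and_route(requests, reachable, capacity)
```

With only one slot per base station, the "asr" request cannot be served at the edge, which is exactly the centralized-offloading quantity that JSPRR-style formulations try to minimize.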
3.3. Offloading decisions
Offloading the tasks of various MDLAs to backgrounds with sufficient resources is the basic idea of the distributed deployment of MDLAs. A key challenge when making an offloading decision is deciding when and how to offload, because offloading is not always beneficial; unstable network conditions, frequent interactions or large amounts of input data may lead to large transmission latency and high energy consumption. However, making the best offloading decision is not an easy and straightforward task. Different MDLA types involve different factors and different weightings of demands, including accuracy, latency, energy consumption, etc. We also need to consider the state of the whole system, including the device temperature, current task number, network type, state of the background server, etc. The inherent complexity and diversity of these factors have led to a variety of studies on computing offloading decision-making. For the offloading mode, computing offloading can be divided into two strategies: complete offloading and partial offloading. Decision-making methods can be divided into rule-based and learning-based methods.

3.3.1. Offloading mode
A typical MDLA can be simply divided into three parts: data acquisition, data preprocessing and data analysis. Data acquisition often requires hardware integrated into the mobile device. Therefore, due to this hardware limitation, it must stay on the mobile device and cannot be offloaded. For the other two subtasks, the optimal offloading decision should be made by comprehensively considering the resources needed, the amount of communication data between subtasks, battery power and the current network bandwidth. In addition, it should be noted that during partial offloading, the order of offloading among subtasks is the reverse of the running order.

a) Complete offloading
Highly integrated or relatively simple tasks cannot be easily partitioned, and for them complete offloading is considered to be the current optimal solution. In [58], the authors define such a computing task model as a binary offloading task model and use three fields to represent its properties: the task input data size, the time limitation and the calculation workload. These three features are the basic attributes of an MDLA, and we can essentially use them, together with the current network bandwidth dynamics, to make offloading decisions.

In one study [59], Pavel Mach and Zdenek Becvar investigate complete offloading from three perspectives: minimizing latency, minimizing energy consumption under delay constraints, and trading off delay and energy consumption. In another project [50], the authors present a platform for offloading MAR tasks to powerful cloud servers completely. They implement this system using a thin-client design.

b) Partial offloading
An MDLA consists of many components and can be divided into multiple partitions to achieve fine-grained (partial) computation offloading.

In partial offloading, we must note the following three points: (i) The dependence among partitions and components influences the execution order of components and the offloading sequence. Components higher in the execution order have a lower offloading priority. (ii) Hardware-constrained components must be executed locally on mobile devices; for example, in a mobile video analysis application, we obtain an image or video stream through a camera on the mobile device, and this step cannot be offloaded. (iii) The size of data exchanged between components and the amount of computation of each component should also be taken into consideration. The tendency is usually to offload components with a large amount of computation or little data traffic with other components, or to offload in the reverse order of execution.

We may consider three aspects of potential strategies: (i) offloading some subtasks of the MDLA to the background to reduce the calculation delay and energy consumption of mobile devices; (ii) offloading part of the processed data instead of all initial data to reduce the transmission delay and energy consumption; and (iii) protecting the security and privacy of users' mobile data. We give some examples of previous work to illustrate these points.

For point a, a mobile video analysis task can be divided into video capture, frame extraction and frame detection. Video capture can only be done by cameras on mobile devices and must be performed locally because of hardware limitations. Frame extraction and frame detection can be offloaded sequentially according to network connection conditions and battery power. Notably, the order of offloading of these two subtasks is constrained: the first subtask to be executed is the last subtask to be offloaded.

For point b, Jain et al. aim to use environmental fingerprinting to achieve immersive, highly contextualized MDLAs, especially MAR. This visual diversity requires matching a unique visual signature against millions of database entries. The computation is heavy, and considerable visual data needs to be offloaded to the cloud. The authors identify the low-entropy characteristics of visual "features" and design a system named VisualPrint to offload only the most distinctive visual data, reducing the time of network transmission [49].
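The split-point choice behind partial offloading can be sketched as picking where to cut a linear pipeline (e.g., capture, frame extraction, frame detection). This is an illustrative model, not a surveyed system; the stage costs, data sizes and bandwidths are hypothetical, and capture is pinned to the device as discussed above.

```python
# Toy sketch of partial offloading for a linear pipeline: stages [0, k) run
# on the device and stages [k, n) are offloaded, with the data produced at
# the split crossing the wireless link. Stage 0 (capture) is hardware-bound,
# so k >= 1. All stage costs, data sizes and bandwidths are hypothetical.

def split_latency(local_s, remote_s, out_bits, k, bandwidth_bps):
    """Latency when the first k stages run locally and the rest remotely."""
    transfer = out_bits[k - 1] / bandwidth_bps       # data crossing the split
    return sum(local_s[:k]) + transfer + sum(remote_s[k:])

def best_split(local_s, remote_s, out_bits, bandwidth_bps):
    n = len(local_s)
    return min(range(1, n + 1),
               key=lambda k: split_latency(local_s, remote_s, out_bits,
                                           k, bandwidth_bps))

local_s  = [0.01, 0.30, 2.00]      # capture, extraction, detection on device
remote_s = [None, 0.03, 0.10]      # capture cannot run remotely
out_bits = [8e6, 4e5, 1e3]         # data produced by each stage

k_fast = best_split(local_s, remote_s, out_bits, bandwidth_bps=20e6)
k_slow = best_split(local_s, remote_s, out_bits, bandwidth_bps=1e5)
```

Under the fast link the pipeline is cut after frame extraction (offloading the heavy detection stage), while under the slow link everything stays on the device, matching the bandwidth-dependent behavior described above.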
For point c, when offloading the data of an MDLA to the background for better and faster execution, mobile users face the risk of data privacy exposure. Partial offloading is beneficial to privacy protection in data exchange [22] [23] [20]. Intermediate data in deep learning models usually have semantics different from those of the raw data. For example, it is difficult to recover the original information by only observing the features extracted from the original data by CNN filters. Therefore, we can offload the higher layers of the deep learning model, and then offload the abstract data processed by the bottom layers on the mobile side to the background. In many works on distributed deep learning, the model on a mobile device is regarded as a worker and is combined with a central server to train the whole deep model. Partial local computing abstracts the user's private data to a certain extent and protects data privacy and safety when it is offloaded to the central server. This can also be applied in a similar way to most mobile crowdsourcing scenarios.

3.3.2. Decision method
a) Rule-based offloading decisions
A rule-based offloading decision usually treats whether to offload as the output of a combinatorial optimization problem under a set of constraints. This method formulates the problem by measuring the context of current task execution under specific constraints and optimization objectives, and uses mathematical techniques to solve the problem and output the decision scheme.

In the field of vision, we list some works on rule-based offloading decisions. In [90] and [91], Ran et al. make extensive measurements to understand the trade-offs between video quality, network conditions, battery consumption, processing delay, and model accuracy, and formulate them as an optimization problem; then, they use a measurement-driven mathematical framework to efficiently solve this combinatorial optimization problem.

In [18], Lu focuses on mobile video analysis; the task publisher needs mobile crowdsourced videos to identify specific objects. A video crowd processing platform is designed, and offloading decisions are made under both Wi-Fi and mobile cellular network connections. Under a Wi-Fi connection, the optimization goal is minimizing completion time, and an algorithm named split-shift is proposed. Under cellular connections subject to data usage constraints, the optimization goal is the trade-off between processing time and energy consumption.

The authors of [56] consider dynamically assigning the workload of a mobile AR system to multiple mobile edge servers to maximize the performance of the MAR system. Liu et al. formulate the trade-off among network latency, computational latency, and analytic accuracy in MAR systems and develop a multi-objective optimization problem to select the optimal edge server and video frame resolution for MAR users. They design a fast and accurate (FACT) algorithm to solve this multi-objective optimization problem based on convex optimization theory.

In other fields, the following works make rule-based offloading decisions for a single MDLA. Sundar et al. study offloading decisions for an MDLA consisting of a set of dependent tasks in a general cloud computing system comprising a heterogeneous local processor network and a remote cloud server. Their optimization target is to minimize the execution cost of the entire application under per-subtask completion time constraints. They propose a heuristic algorithm named ITAGs to solve this NP-hard problem [80].

The work of [92] aims at minimizing task delay. This optimization problem takes the queue state of the task buffer, the execution state of local processing units and the state of transmission units as inputs to determine whether to offload completely.

Xu, Chen and Zhou regard the minimization of the computational delay and device energy consumption on the server as the optimization target, and their constraint conditions are the cache capacity of the edge server, the maximum delay limitation of tasks, and the battery power of the device. Their system outputs a service placement layout on servers and an offloading decision for devices.

Third, for offloading multiple MDLAs on multiple mobile devices, we list the following works, most of which jointly optimize the offloading decisions of these tasks. The authors of [93], [94], and [95] study the joint optimization problem of multitask offloading across multiple mobile devices. They measure the arrival rate of data packets in each time slot and the current network channel conditions as input, use the completion time limitation of each task as a set of constraints, and aim to minimize the energy consumption of the mobile devices. They finally output the offloading decision for each mobile device and the allocation of wireless and computing resources on the server among multiple tasks.

The work of [96] also considers the joint optimization of multiple tasks on multiple mobile devices. It minimizes the trade-off between the energy consumption of mobile devices and task execution latency, and outputs the offloading decision for each task and its optimal wireless channel selection.
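The rule-based decisions above share a common skeleton: measure the current context, evaluate a weighted cost for each option, and pick the cheaper one. The following sketch is a hypothetical instance of that skeleton, not any surveyed system; all power and speed constants are assumptions.

```python
# Minimal rule-based decision sketch: offload iff the weighted
# latency/energy cost of offloading beats local execution under the
# current context (bandwidth, battery). All constants are hypothetical.

def decide_offload(cycles, data_bits, bw_bps, battery_frac,
                   cpu_hz=1e9, server_hz=1e10,
                   p_cpu_w=2.0, p_tx_w=1.0):
    """Return True if offloading has the lower weighted cost."""
    t_local = cycles / cpu_hz                        # seconds on device
    e_local = p_cpu_w * t_local                      # joules on device
    t_off = data_bits / bw_bps + cycles / server_hz  # upload + server compute
    e_off = p_tx_w * (data_bits / bw_bps)            # device only transmits
    # Low battery shifts the weight from latency toward energy.
    w_energy = 1.0 - battery_frac
    cost = lambda t, e: (1 - w_energy) * t + w_energy * e
    return cost(t_off, e_off) < cost(t_local, e_local)

# Good Wi-Fi, full battery: offloading wins on latency.
d1 = decide_offload(cycles=5e9, data_bits=8e6, bw_bps=50e6, battery_frac=1.0)
# Poor cellular link: transmission dominates, so run locally.
d2 = decide_offload(cycles=5e9, data_bits=8e6, bw_bps=2e5, battery_frac=1.0)
```

The battery-dependent weighting stands in for the constraint sets (delay limits, energy budgets) that the surveyed formulations encode more rigorously.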
b) Learning-based offloading decisions
Making offloading decisions based on learning begins by collecting, quantifying and characterizing the current running context of the program, including battery power, application properties, user mobility, network status, etc., and then uses them as input to a deep learning model that outputs whether and how to offload at the current moment. At present, learning-based methods mainly use two basic strategies: DNNs, which are usually used to construct classifiers, and DRL, which has excellent performance in decision-making.

In [97], the authors propose a novel mechanism for optimizing offloading performance by using crowd-sensed evidence traces and constructing a DNN offloading-decision classifier. They believe that for MDLAs, data from one device is obviously not enough to quantify the individual factors affecting offloading due to their inherent complexity and diversity. Huber Flores et al. aggregate samples from a larger community of devices and design an evidence analyzer using a DNN to identify times beneficial for offloading by analyzing evidence traces collected through crowdsensing.

Duc Van Le et al. propose a Deep Reinforcement Learning (DRL)-based offloading scheme that enables users to make near-optimal offloading decisions under uncertainty in user and cloudlet movements and cloudlet resource availability. They propose a Markov Decision Process (MDP)-based offloading problem formulation and then use a deep reinforcement learning scheme called a Deep Q-Network (DQN) to learn an efficient solution for the proposed MDP-based offloading problem [82].

In addition to making an appropriate offloading decision, we can improve the effectiveness of offloading through special offloading mechanisms. For example, Wasiur et al. use a queuing-theoretic description of a collaborative uploading scenario, split data into chunks and offload them over multiple paths; finally, these chunks are merged at the destination [98]. This method can reduce the network transmission delay significantly and can be generally applied to other offloading work. This approach is a special offloading technique rather than a black box that takes the network state as input and outputs an offloading decision. We can also consider other special offloading methods for each field; although such methods may be unusual and not suitable for general work, for offloading tasks in certain fields they are, compared with conventional methods, a good way to further improve offloading performance.

3.4. Distributed cache
In the broad use of MDLAs, we can observe two points. First, a user-requested service has a high degree of repeatability; that is, the same application in the application store is downloaded and run on thousands of different mobile devices by thousands of users. By deploying these services in a mobile edge network, mobile users can easily offload their MDLA data to edge servers under good network conditions, which can greatly reduce the MDLA's execution latency. Whether a service is cached on a certain edge server directly determines whether users in its coverage region can offload their computing to the edge. Second, in MAR applications and many other video and image MDLAs, similar video content may be repeatedly requested by many users. Because video transmission takes up a large bandwidth, fetching the video from the CDN for each request would cause great bandwidth waste through repeated content transmission. Therefore, we should adopt an intelligent cache strategy in the mobile network to enable mobile users to obtain content from a nearby cache, which could significantly reduce the data access time of the MDLA and greatly eliminate the influence of network connection dynamics.

It has been shown that caches in 3G mobile networks and 4G LTE networks can reduce mobile traffic by 1/3 to 2/3 [99]. In addition, the energy efficiency of the 4G network can be improved. The evolution of the green 5G network can be effectively promoted by the intelligent caching of popular content to reduce traffic load.

3.4.1. Cache content
For MDLAs, there are two kinds of content to be cached in the edge network: the deep models of MDLAs, which we call services, and the MDLA input data. Caching services requested by a large number of users in the edge network enables mobile users to offload the corresponding computation, and the benefit depends on the popularity of the cached services. MDLA input data generally comprises data types with large transmission bandwidths, such as videos, images and common data.
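The repeatability argument above can be made concrete with a Zipf-like popularity model: when requests are heavily skewed toward a few items, caching a small fraction of the catalog already absorbs a large share of traffic. The catalog size, skew exponent and cache sizes below are hypothetical.

```python
# Sketch of why caching pays off for highly repeatable requests: under a
# Zipf-like popularity distribution, caching only the top-k items serves a
# disproportionately large fraction of requests. All parameters are
# hypothetical.

def zipf_popularity(n_items, alpha=0.8):
    """Normalized Zipf(alpha) request probabilities for ranks 1..n_items."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def top_k_hit_ratio(popularity, k):
    """Fraction of requests served if the k most popular items are cached."""
    return sum(sorted(popularity, reverse=True)[:k])

pop = zipf_popularity(n_items=1000)
hit_10 = top_k_hit_ratio(pop, 10)      # cache 1% of the catalog
hit_100 = top_k_hit_ratio(pop, 100)    # cache 10% of the catalog
```

The diminishing return from `hit_10` to `hit_100` is the quantity that cache sizing and placement decisions in the following subsections trade against storage cost.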
a) Service
A service cache stores the deep model and its associated databases on an edge server or a nearby computable device, which allows users to offload the corresponding computing tasks to the edge. Since we can only cache a limited number of MDLA services in resource-constrained edge servers at the same time, we must carefully decide which services to cache to maximize the profit of offloading for the overall system. The services that we cache on an edge server decide which tasks users can offload. If we cache the most popular services, the system may obtain the maximum performance benefit.

Xu et al. formalize the joint service caching and task-offloading problem in MEC-enabled dense cellular networks to minimize computation latency under a long-term energy consumption constraint. They develop a novel online algorithm named OREO to perform stochastic service caching online without requiring future information [55].

Wang et al. focus on mobile VR applications with the support of online social networks. They divide VR applications into two parts: Service Entities (SEs) on servers and client entities on mobile devices. They define the Edge Service Entity Placement problem (ESEP) as the problem of deciding where to place the SE of each user among the edge servers to maximize the economic profits of the edge servers as well as achieve the desired level of QoS for users, and they propose an iterative algorithm named ITEM to solve this problem [100].

b) Data
The input data of MDLAs can be divided into two types: (i) real-time data acquired by hardware on mobile devices, such as images for object classification; and (ii) offline data acquired from the content provider in the central storage center, for example, multimedia data to support VR/AR and panoramic views. Distributed data caching works only for offline data, not for real-time data. Therefore, for real-time data, we discuss data compression and transmission; for offline data, we discuss distributed caching.

First, for real-time data, the operation is offloading, and the main problem is the limited, dynamic wireless bandwidth. In addition, we have to face the reality that most deep models are very sensitive to data noise [101], so we need to offload high-quality data. Xie and Kim developed a DNN-aware data compression framework named GRACE to compress the real-time image and video data acquired on IoT devices, which reduces the network bandwidth consumption of data transmission without affecting the performance of DNN inference on edge servers [102].

Second, for non-real-time data, in many distributed MDLA deployments, users need to exchange data frequently with the server, or many users may request the same multimedia content from the CDN repeatedly. In a traditional cloud-based architecture, content is usually obtained from the central data storage center, far away from users, which is not suited to frequent data exchange and large numbers of access requests. This produces considerable delay and degrades users' QoS. Therefore, in recent years, it has become increasingly popular to cache data at places close to users, such as nearby edge servers or ad hoc devices.

Zhang et al. observe that more bandwidth is required for VR video applications to achieve high temporal and spatial fidelity. They design a VR video delivery system based on Named Data Networking (NDN) and propose an integrated hotspot-based and popularity-based caching policy to cache the content that is most likely to be requested, reducing the transmission delay of VR videos and enhancing user experience [103].

Hao et al. study knowledge-centric proactive edge caching in mobile content distribution networks. The high dynamics of mobile video streams and complex user playback behaviors make it difficult to decide which content should be cached through popularity-based investigations or probability-based predictions. This work optimizes the caching configuration based on semantic information about the online playback behavior of 5G multimedia service users. They mathematically formulate this NP-complete caching optimization problem and propose a greedy online caching configuration algorithm to minimize the overall delivery cost of video streaming while maximizing the edge caching utilization ratio [104].

Mohan et al. propose an efficient edge caching mechanism leveraging edge resources to predict and store data required for upcoming computations. Their solution groups caches according to the workloads of different services. They further develop methods for populating caches and ensuring the coherence of cached data [105].
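The service-selection decision discussed in this subsection can be sketched as a knapsack-style greedy choice on one storage-constrained server. This is only an illustration of the trade-off; the surveyed algorithms (e.g., OREO [55], ITEM [100]) solve richer online and multi-server variants, and the catalog below is hypothetical.

```python
# Illustrative greedy sketch: a resource-constrained edge server picks which
# MDLA services (deep models) to cache under a storage budget, favoring
# services with high request rates per megabyte. All numbers are hypothetical.

def greedy_service_cache(services, budget_mb):
    """services: list of (name, size_mb, requests_per_s).
    Greedy by benefit density (requests served per MB of storage)."""
    chosen, used = [], 0.0
    ranked = sorted(services, key=lambda s: s[2] / s[1], reverse=True)
    for name, size, _ in ranked:
        if used + size <= budget_mb:
            chosen.append(name)
            used += size
    return chosen

catalog = [("object-detect", 200, 50.0),   # heavy but very popular
           ("translate",      50, 20.0),
           ("style-transfer", 300,  5.0),
           ("asr",           100, 30.0)]
cached = greedy_service_cache(catalog, budget_mb=350)
```

Density-based greedy is a standard heuristic for such knapsack-like placement problems; it is not optimal in general, which is why the surveyed works resort to approximation algorithms with provable guarantees.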
3.4.2. Cache location
From ad hoc devices to remote clouds, there are many possible cache locations. Considering the characteristics of full-IP cellular networks, we can divide them into three main storage locations: EPC core networks, Radio Access Networks (RANs) and ad hoc devices. However, when deciding the caching location, we need to consider a basic problem: although a closer cache reduces the redundant transmission of identical content in the rest of the network and relieves the core, most networks are organized in a tree distribution hierarchy, and the closer the caching location is to the user, the fewer users it covers. Fewer users are served by edge servers than by more central cloud storage, and if no user requests data at a certain cache point, it may not be necessary to actively bring replicated content to that edge point. Therefore, intelligent selection of cache locations and optimization of content placement are required.

In Fig. 2, we show the transmission of content in four cases: no cache, using a core network cache, using a wireless access network cache, and using an ad hoc device cache with D2D communication. In the "no cache" case, every content request needs to be transmitted through a complex network to retrieve content from a remote ISP, resulting in great storage redundancy and transmission delays. After adding the core network cache, the communication between the ISP and the core network can be somewhat reduced; after adding the RAN cache, the traffic between the access network and the core network can be significantly reduced; if the device cache and D2D communication are further added, the transmission delay can be further reduced.

Fig. 2: Different content transfer requests due to different cache locations. From left to right: no cache, core network caching, edge network caching, ad hoc cloudlet with D2D link.

a) Core network

b) Radio access network
Recently, many works prefer pushing content caches to edges closer to users. Especially in the emerging 5G network, Base Stations (BSs) are naturally equipped with edge servers (such as the Nvidia Jetson TX2) and provide storage capabilities for cache services. By caching appropriate content at the nearby edge, viewers can obtain the target content locally instead of from a remote CDN server, which not only provides better QoE with lower latency but also saves core network traffic costs. RAN caching typically caches content in the eNB, and it is mainly divided into two categories:

Macro base station: a macro base station has large coverage to serve more users and has rich storage resources for a better cache hit ratio. In the work of Gu [106], the authors analyze the caching distribution problem in a macro base station as an NP-hard problem and propose a heuristic algorithm to solve it.

Micro base station: compared with a macro base station, a micro base station has less storage space and smaller coverage, so it may have a lower cache hit ratio in terms of the diversity of cached content. However, micro base stations bring greater flexibility; more importantly, cooperative content sharing between micro base stations can jointly optimize users' requests to improve the cache hit rate. In addition, one of the greatest advantages of micro base stations over macro base stations is that they are closer to the user, so they can bring smaller delays.
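The four cases of Fig. 2 can be summarized with a back-of-the-envelope expected-latency model: a request is served by the nearest tier that holds the content, so each added cache tier trims the miss traffic that travels further. The hit probabilities and latencies below are purely hypothetical.

```python
# Sketch of the cache-location trade-off in Fig. 2: each tier has a hit
# probability and a round-trip latency, ordered nearest first, and a request
# is served by the closest tier that holds the content. All probabilities
# and latencies are hypothetical.

def expected_latency(tiers):
    """tiers: list of (hit_prob, latency_s), nearest tier first.
    The last tier (the remote ISP/CDN origin) must always hit."""
    expected, p_miss = 0.0, 1.0
    for hit, latency in tiers:
        expected += p_miss * hit * latency
        p_miss *= (1.0 - hit)
    return expected

no_cache  = [(1.0, 0.200)]                              # remote ISP only
with_edge = [(0.3, 0.010),                              # D2D / device cache
             (0.2, 0.050),                              # RAN cache
             (1.0, 0.200)]                              # remote ISP
lat_none = expected_latency(no_cache)
lat_edge = expected_latency(with_edge)
```

Even with the modest hit ratios assumed here, the nearer tiers cut the expected latency well below the cache-free baseline, which is the qualitative effect Fig. 2 illustrates.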
c) Ad hoc devices
Content can also be cached on ad hoc nodes or on the devices of users with similar characteristics, using D2D communication to assist transmission.

Aravindh Ramane et al. cache data on family auxiliary nodes or the mobile devices of socially related users and then connect them together in a distributed way to realize content sharing. They design an edge-caching architecture named Wi-Stitch, an "edge-stitching" distributed content transmission network [108]. In their recent work [109], Wi-Stitch is extended by solving two main problems: (i) a shared node may not have enough bandwidth to share content associated with its limited Wi-Fi AP; and (ii) Wi-Stitch may produce multiple copies of popular content but insufficient copies of less popular content. The authors formulate these as optimization problems and solve them by strategically placing content for sharing within a geographically localized cell.

Akshay Mete and Sharayu Moharir combine a central server with multiple end users. They form a content delivery system that supports three content delivery modes: (i) the central server stores the entire content catalog and delivers the requested content to mobile users; (ii) mobile devices have a limited cache and can transmit content to each other through D2D communication; and (iii) the central server transmits the content to some mobile users, who then transmit it to other users. Their goal is to decide which content to cache on the end devices to minimize the service cost [110].

Zhang et al. consider the QoS of a two-hop wireless connection with a delay constraint in a multimedia big data offloading architecture. When two mobile users request the same multimedia content, one downloads the data from a BS and uses D2D to forward it to the other. The authors propose three optimal single-hop transmission power allocation schemes to support this two-hop wireless link transmission while ensuring the bounded QoS requirements of the two single hops [111].
20 Wang Yingchun, et al.
popularity, potential popularity, storage size, and the viewpoint of users moves with the movement
the location of its existing copies in the network of the football. In the game, most players are only
topology. It is more challenging than traditional concerned about their own situation. These areas
cache strategies, including LRU, LFU, FIFO, etc., of interest, or video clips, are defined as hot spots
and its goal is to improve the cache hit rate and and should be prioritized in caching. Sometimes,
reduce the network transmission bandwidth con- there are multiple hot spots in a view. For exam-
sumption. ple, the work of [103] models the attraction of all
a) Popularity-based viewpoints in a VR view to cache the content that
Caching content with higher popularity can is most likely to be requested.
maximize the total QoE of all users within a cer- c) User preference-based
tain region. The popularity of content is defined The user preference profile includes the proba-
as the ratio of the number of requests for specific bilities of a specific user requesting each content
content to the total number of user requests. It is over a certain period of time, and there are signif-
restricted to a certain area in a certain period[112]. icant differences among different individuals. This
of
However, it is worth noting that the popularity of is because users usually have a strong preference
content is not static; it follows a Zipf distribution, for specific content categories. Users’ preferences
ro
which is a power-law distribution[113]. Therefore, can be predicted according to the historical content
it is necessary to update the popularity of content requirements of users and the similarity between
in a popularity-based cache strategy.
Zhang et al. design a data structure to record
the content popularity of each router. In addition,
-p
users. This information can be widely used in rec-
ommendation systems. Because of the character
of preference, user numbers under a certain prefer-
re
each router needs to communicate with the oth- ence category will not be too large, so it is suitable
ers to calculate global popularity information[103]. to be applied to cache servers with small coverage,
lP
The work of [114] records the global popularity such as SBs or home cloudlets.
of every video segment at every viewpoint. Large d) Learning-based
popularity means that this video segment has been First, content popularity is region-specific and
requested many times by users, so the cache of the not fixed, so it is difficult to capture. Second,
na
segment is even more meaningful. in most cases, the content we cache is a video
The work of [115] analyzes the dynamic adapt- stream that faces highly dynamic and complex user
ability of popularity in the cache algorithm from playback behavior. Therefore, a learning-based
ur
two aspects: (i) learning the accuracy of fixed pop- caching policy using knowledge of content demand
ularity distribution; and (ii) learning the changing history is very promising. For instance, in the work
Jo
speed of popularity for certain content. Based on of [78], the authors use multiagent reinforcement
both of these aspects, they propose a novel hybrid learning to design content cache policy in mobile
algorithm to learn popularity changes faster and D2D networks without the need for acquiring real-
better. time requirements and popularity. They propose a
Another important fact is that the distribution of belief-based Modified Combinatorial Upper Con-
content popularity in a large area is often differ- fidence Bound (MCUCB) algorithm to solve the
ent from the distribution in a small area. There- problem of large joint action.
fore, the measurement of content popularity faces Hao et al. implement a cache policy based on
the difficulty of spatial granularity knowledge, not user playback behavior[104]. They use deep be-
only because the coverage of various types of edge lief networks to capture the semantic information
servers is different but also because the users are of users and infer the video that will be requested
in dynamic flow between multiple units. This will in the future based on the user’s playback mode.
also have an impact on the prediction of content The video is actively cached in the edge network.
popularity, especially for edge servers in SBs with
a small coverage. 3.5. Summary of this chapter
b) Hot spot-based This section investigates the distributed deploy-
In many multimedia MDLAs such as VR/AR, ment scheme of MDLAs from three perspectives:
video streaming services, and real-time interactive deployment architecture, offloading decisions and
games, the user usually looks at the most attrac- distributed caches. The main work is summarized
tive viewpoint. For example, in a football game, in Table 3:
Table 3: Summary of distributed deployment schemes for MDLAs

Offloading location
- Remote cloud [50, 81]: remote cloud networks are rich in resources but have large transmission delays
- Edge network [56, 82, 116]: network resources are not as great as those of the remote cloud

Offloading mode
- SS-MU [52–54]: server resource allocation and radio channel contention issues

Cache content
- Services [55, 100]: caching app services and related databases in edge servers
- Data [103, 105]: caching data frequently requested by users, particularly video

Cache locations
- Core network [99]: caching at the cloud with rich memory, but far from users and with large transmission delays
- RAN [99, 106]: caching at the eNB or nearby edge servers, leading to faster response and lower latency
- Peer devices [108, 110, 111]: caching data closer to users, further reducing transmission waiting

Cache policy
- Popularity [103, 112–115]: uses the probability of certain content being requested in a certain period of time, which is related to the specific region; caching content with greater popularity is more important
- Hot spot [103]: the highest-interest point of users in images or videos; caching high-quality hot-spot content can improve users' experience
- Preferences of users: a user typically has a strong preference for particular content categories; we need to predict and cache the content that individuals prefer
- Learning [104]: learning-based caching strategy with content-request history knowledge, which has high dynamic adaptability
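To make the popularity-based policy summarized above concrete, the sketch below draws a request trace from a Zipf popularity distribution (the power-law model from [113]) and compares the expected hit ratio of caching the k most popular items against a plain LRU baseline. The catalog size, cache size, and skew parameter are illustrative assumptions of ours, not values taken from the surveyed papers.

```python
import random

def zipf_weights(n_items: int, s: float = 1.0):
    """Unnormalized Zipf popularity: item i (1-indexed) has weight 1 / i**s."""
    return [1.0 / (i ** s) for i in range(1, n_items + 1)]

def top_k_hit_ratio(weights, k: int) -> float:
    """Expected hit ratio when the k most popular items are cached."""
    total = sum(weights)
    return sum(sorted(weights, reverse=True)[:k]) / total

def lru_hit_ratio(trace, capacity: int) -> float:
    """Measured hit ratio of a plain LRU cache over a request trace."""
    cache = []  # front = most recently used
    hits = 0
    for item in trace:
        if item in cache:
            hits += 1
            cache.remove(item)
        elif len(cache) >= capacity:
            cache.pop()  # evict the least recently used item
        cache.insert(0, item)
    return hits / len(trace)

if __name__ == "__main__":
    rng = random.Random(0)
    weights = zipf_weights(n_items=100, s=1.0)
    trace = rng.choices(range(100), weights=weights, k=5000)
    print(f"popularity top-10 hit ratio: {top_k_hit_ratio(weights, 10):.3f}")
    print(f"LRU(10) hit ratio on Zipf trace: {lru_hit_ratio(trace, 10):.3f}")
```

Because popularity must be re-estimated as it drifts, a real policy would periodically refresh the weights rather than cache a fixed top-k forever.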
vices or edge servers. If the calculation is complex, it can be further offloaded to the remote cloud. This deep learning is lightweight, portable, and close to users. User-oriented MDLAs center intelligence around the user. For example, the real-time video or image acquired by a mobile-side camera supports deploying object recognition and tracking technology on the mobile terminal so that the mobile device has vision, which can be used in tasks such as face recognition for user authentication and local video analysis. We can use the camera on a mobile device to collect lip motion video, and the lip-reading information can be used not only for deaf and mute information input but also for lip-reading authentication [3]. Augmented reality and virtual reality technology provide virtual superhuman vision for us and have attracted much interest from academia and industry.

… analysis crowdsources multimedia data to locate criminals; traffic forecasting collects complaints from a large number of mobile users about accidents and congestion on the ground in the early peak period so that service providers can provide more accurate real-time traffic condition reports and obtain better economic benefits. (ii) Distributed deep learning tasks [22–24]: distributed deep learning on mobile devices makes more efficient and convenient use of the large quantity of data generated by mobile terminals; it not only solves the problem of large data sets and large models in the traditional centralized training mode but also effectively addresses the data privacy of mobile terminal users. (iii) There are also various internet of things applications [26] and applications that use mobile big data to develop various services [27, 28].
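The distributed deep learning direction above can be illustrated with a minimal sketch of the federated averaging step from [25]: clients train locally, and only their model parameters, weighted by local data size, are aggregated, so raw user data never leaves the device. The flattened-parameter representation and the names below are simplifications we introduce for illustration only.

```python
def fed_avg(client_weights, client_sizes):
    """One FedAvg aggregation round: average client models weighted by
    local dataset size, so no raw user data leaves the device."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[j] * n / total for w, n in zip(client_weights, client_sizes))
        for j in range(n_params)
    ]

if __name__ == "__main__":
    # Two hypothetical clients with flattened model parameters.
    clients = [[1.0, 0.0], [3.0, 2.0]]
    sizes = [100, 300]  # the second client holds 3x more data
    print(fed_avg(clients, sizes))  # pulled toward the data-rich client
```

In a full system this averaging step would alternate with several epochs of local SGD on each device; only the aggregation is shown here.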
… continuous feedback. Therefore, first, we should consider how to enrich mobile information acquisition devices, which are not limited to the existing cameras, microphones, temperature sensors, etc., to obtain data with more comprehensive feature dimensions. Second, how to improve the accuracy of the data acquired by mobile information acquisition devices, reduce the quantity of dirty data and improve the ability to acquire accurate data in poor environments are also areas where mobile information acquisition devices need to be improved. Third, the heterogeneity of sensor quality across devices is also one of the key issues [120]. On the one hand, mobile devices are usually equipped with nonprofessional sensors, and the sensor quality of different devices may be uneven, which leads to uneven quality of the obtained sensor data; in third-party mobile applications, for example, this has a great impact on the accuracy of deep learning tasks. On the other hand, the workload of the mobile system is unpredictable, which may result in different sampling rates in different time periods, so the quality of sensor data may be unstable over time. Work has been done [121] to solve these problems, but there are still some shortcomings, such as the execution time and energy consumption of the whole optimization framework on mobile devices.

In terms of the mobile device market, it can be seen that mobile devices tend to become smaller, thinner and more portable. All of these factors limit the size of the device hardware, such as the CPU, heat sink area and memory chip, which restricts the performance of MDLAs. Improving the computing and storage ability of mobile devices, for example by introducing deep learning chipsets and GPUs into mobile devices, is one of the key development directions of mobile devices in the future.

5.1.3. Mobile device power consumption
As is well known, the computation of most MDLAs is very heavy, especially for vision applications, while the battery capacity of most mobile devices is restricted and not enough to support complete local operation. Although we can offload the computation to the background for high-performance processing, sometimes, due to high data privacy requirements and poor network conditions, some users may prefer to run their applications locally on the mobile end. Besides, offloading itself is also energy-consuming. So, how to improve the endurance of mobile devices is a key problem. We can study it by reducing the energy consumption of MDLAs and by improving the battery capacity of mobile devices.

5.2. Data management
5.2.1. Data personalized management
Smartphones have become the main computing platform for millions of people. They also represent a new set of input devices: millions of cameras, microphones, GPS devices and many other types of sensors that generate massive data at every moment. With the increase in the number of connected devices, including mobile phones, tablets and laptops, there is an urgent need for personalized management of mobile user data. There are two main questions. The first is how to store massive data: these data cannot be completely stored on mobile devices, because the limited storage capacity of a mobile device can only support storing a small quantity of running data and personal data, so massive data have to be stored elsewhere, such as in domestic microclouds and on other mobile devices. The second is how massive data can be managed in a personalized way. For users, this concerns convenient data acquisition, data consistency, data recovery, data updating and cleaning, etc. For enterprises, in core data mining, marketing, maintenance of member service data, etc., there are many factors to be considered. Because of the massive size and frequent updates of data, a more detailed design is needed in MDLA data management.

5.2.2. Data privacy and security
Offloading user data to the background will inevitably face data privacy and security problems. First, for highly private data, privacy issues in communication, such as eavesdropping, may occur in the process of offloading. Second, there is the problem of how to protect the privacy of user data after offloading it to the background server.
Third, massive data have brought great value, including a large number of data models and information; this raises the questions of how to obtain the user's consent to use them and how to protect the user's privacy while using these data with the user's consent. The former may require the design of nontechnical mechanisms such as reward mechanisms. The latter requires the design and support of technical measures such as data encryption and feature abstraction.

5.3. System and network
5.3.1. Distributed systems for MDLAs
Most offloading of MDLAs is related to the work of distributed computing and caching. The traditional design of highly available distributed systems usually needs to achieve redundancy, state synchronization, resource scheduling, system self-inspection, fault recovery, convenient scaling, etc. In the process of offloading MDLAs, we need to consider not only the typical factors above but also the characteristics of mobile devices and deep learning models, as well as the challenges brought by the diversity of offloading locations. Mobile devices move and rely on unstable wireless networks and cellular connections, which makes it difficult for offloading to achieve high fault tolerance and a stable state. Mobility also affects offloading and cache decisions by changing a device's location among multiple servers. This raises the questions of (i) how to model the spatiotemporal characteristics of users; (ii) how to cache a user's content in a mobility-aware way; (iii) how to improve the hit rate of the edge cache when users request cached content; and (iv) how to ensure the continuity of services when the mobility of users is unknown. All of these questions have to be considered after the introduction of mobility. What's more, we now also have to consider the deep learning model's distributed training and parallel inference.

5.3.2. Advancing communication
As content and computing migrate to the edge side, the vigorous development of MDLAs shows the characteristics of low delay and high reliability in computing and of distribution and high bandwidth in content. This poses the development requirements of ultra-large bandwidth, ultra-large connection, ultra-reliability and low delay for new communication networks. Boccardi's work has identified five key technologies of 5G: device-centered architecture, millimeter-wave technology, large-scale MIMO systems, more intelligent devices, and Machine-to-Machine (M2M) communication [122]. We will discuss the communication challenges of MDLAs based on these five technologies.

a) Device-centered architecture: In the past, as the basic unit of the wireless access network, the cell played an important role in controlling the uplink and downlink transmission of data services. Recently, however, the focus has gradually moved from core networks to peripheral devices, and the traditional cell-centric architecture has been disrupted. We need to redefine the network architecture for the new era, and we face several challenges. First, we must study ultra-dense heterogeneous networks: MDLAs make the density of heterogeneous access in mobile cellular units increase rapidly, and a simple, single communication network architecture is not enough to meet intensive and diversified user needs. The design of communication is now affected by the type of MDLAs popular in an area and by user mobility, and the coordination compensation between different layers of the network architecture also needs to be considered. In addition, with the rapid increase in base station density, achieving more flexible adaptive resource scheduling between base stations for MDLAs also urgently needs to be solved. Second, MDLAs also need 5G technology with strong connectivity and highly intensive deployment.

b) Communication technology: (1) Millimeter waves: 5G has drawn attention to millimeter waves, which bring greater bandwidth, richer spectrum resources, more high-frequency antennas and higher propagation accuracy. However, millimeter waves are easily affected by the environment, and their propagation distance is short, so we need more technologies to improve signal anti-interference abilities and reduce path loss. (2) Communication between mobile devices: To share content and jointly perform computation between mobile devices wirelessly, more efficient D2D communication needs to be developed. Efforts need to be made to design user-sharing schemes, covering both hardware and content.

5.4. New application types
People always need more types of MDLAs and new mobile devices. To start with, the recent app market has shown us its unlimited possibilities. For example, apps that run on traditional mobile devices (including but not limited to recommendations from nearby smart friends) are
voice recognition modules that allow users to issue voice commands in social software. Besides, the combination of visual services and DL results in super-visual services: auto-beautifying of multimedia data, 360-degree panoramic transmission, viewpoint HD, super-resolution reconstruction, post-occlusion visual extension, visual authentication, etc. Moreover, intelligence in online shopping can be applied to building a user image for commodity recommendation, false goods and poor seller analysis [123], etc. In addition, some novel applications on new types of mobile devices have appeared in recent years, such as wearable mobile devices [124], AR/VR glasses, smart tableware [12], and driverless cars [125]. In fact, these apps are not enough. Human social activities, work activities and physical activities, as well as sensory activities such as vision, hearing, smell, taste and touch, will be combined with mobile deep learning in the future. The new design of MDLAs will make our lives much easier.

6. Conclusion
There are two ways of deploying MDLAs. One way is to execute them locally on mobile devices. The main methods include (i) reducing complexity by improving the deep learning algorithm or by redesigning the model architecture to be suitable for mobile terminals; and (ii) reusing the intermediate results of deep models.

Acknowledgments
This work was supported by grants including No. 61472317 and No. 61502379, the MOE Innovation Research Team No. IRT 17R86, and the Project of China Knowledge Centre for Engineering Science and Technology.

References
[1] L. Wei, W. Luo, J. Weng, Y. Zhong, X. Zhang, Z. Yan, Machine learning-based malicious application detection of android, IEEE Access 5 (2017) 25591–25601. doi:10.1109/ACCESS.2017.2771470.
[2] S. Xu, L. Zhang, A. Li, X. Y. Li, C. Ruan, W. Huang, Appdna: App behavior profiling via graph-based deep learning, in: 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 1475–1483.
[3] L. Lu, J. Yu, Y. Chen, H. Liu, Y. Zhu, L. Kong, M. Li, Lip reading-based user authentication through acoustic sensing on smartphones, IEEE/ACM Transactions on Networking 27 (1) (2019) 1–14.
[4] B. Zhou, J. Lohokare, R. Gao, F. Ye, Echoprint: Two-factor authentication using acoustics and vision on smartphones, in: Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, MobiCom 2018, New Delhi, India, October 29 - November 02, 2018, 2018, pp. 321–336.
[5] B. Fang, X. Zeng, M. Zhang, Nestdnn: Resource-aware multi-tenant on-device deep learning for continuous mobile vision, in: Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, MobiCom 2018, New Delhi, India, October 29 - November 02, 2018, 2018.
[6] …, in: Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, MobiCom '18, ACM, New York, NY, USA, 2018, pp. 129–144.
[7] J. Ren, L. Gao, H. Wang, Z. Wang, Optimise web browsing on heterogeneous mobile platforms: A machine learning based approach, in: 2017 IEEE Conference on Computer Communications, INFOCOM 2017, Atlanta, GA, USA, May 1-4, 2017, 2017, pp. 1–9.
[12] Q. Huang, Z. Yang, Q. Zhang, Smart-u: Smart utensils know what you eat, in: 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 1439–1447.
[13] T. Zhao, J. Liu, Y. Wang, H. Liu, Y. Chen, Ppg-based finger-level gesture recognition leveraging wearables, in: 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 1457–1465.
[14] M. Cheung, J. She, L. Liu, Deep learning-based online counterfeit-seller detection, in: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops, INFOCOM Workshops 2018, Honolulu, HI, USA, April 15-19, 2018, 2018, pp. 51–56.
[15] Y. Zou, G. Wang, K. Wu, L. M. Ni, Smartsensing: Sensing through walls with your smartphone!, in: 11th IEEE International Conference on Mobile Ad Hoc and Sensor Systems, MASS 2014, Philadelphia, PA, USA, October 28-30, 2014, 2014, pp. 55–63.
[16] T. Meng, X. Jing, Z. Yan, W. Pedrycz, A survey on machine learning for data fusion, Information Fusion 57 (2020) 115–129.
[17] J. Wang, X. Jing, Z. Yan, Y. Fu, W. Pedrycz, L. T. Yang, A survey on trust evaluation based on machine learning, ACM Comput. Surv. 53 (5).
[18] Z. Lu, K. S. Chan, S. Pu, T. L. Porta, Crowdvision: A computing platform for video crowdprocessing using deep learning, IEEE Transactions on Mobile Computing PP (99) (2018) 1–1.
[19] Y. Tian, W. Wei, Q. Li, F. Xu, S. Zhong, Mobicrowd: Mobile crowdsourcing on location-based social networks, in: 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 2726–2734.
[20] Q. Xu, R. Zheng, When data acquisition meets data analytics: A distributed active learning framework for optimal budgeted mobile crowdsensing, in: 2017 IEEE Conference on Computer Communications, INFOCOM 2017, Atlanta, GA, USA, May 1-4, 2017, 2017, pp. 1–9.
[21] S. He, K. G. Shin, Steering crowdsourced signal map construction via bayesian compressive sensing, in: 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 1016–1024.
[22] T. Tuor, S. Wang, T. Salonidis, B. Ko, K. K. Leung, Demo abstract: Distributed machine learning at resource-limited edge nodes, in: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops, INFOCOM Workshops 2018, Honolulu, HI, USA, April 15-19, 2018, 2018, pp. 1–2.
[23] S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, K. Chan, When edge meets learning: Adaptive control for resource-constrained distributed machine learning, in: 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 63–71.
[24] Y. Bao, Y. Peng, C. Wu, Z. Li, Online job scheduling in distributed machine learning clusters, in: 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 495–503.
[25] B. McMahan, E. Moore, D. Ramage, S. Hampson, B. A. y Arcas, Communication-efficient learning of deep networks from decentralized data, in: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, 20-22 April 2017, Fort Lauderdale, FL, USA, Vol. 54 of Proceedings of Machine Learning Research, PMLR, 2017, pp. 1273–1282.
[26] L. He, K. Ota, M. Dong, Learning iot in edge: Deep learning for the internet of things with edge computing, IEEE Network 32 (1) (2018) 96–101.
[27] Y. Chen, L. Shu, L. Wang, Poster abstract: Traffic flow prediction with big data: A deep learning based time series model, in: 2017 IEEE Conference on Computer Communications Workshops, INFOCOM Workshops, Atlanta, GA, USA, May 1-4, 2017, 2017, pp. 1010–1011.
[28] Y. Hou, P. Zhou, J. Xu, D. O. Wu, Course recommendation of mooc with big data support: A contextual online learning approach, in: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2018, pp. 106–111.
[29] L. N. Huynh, Y. Lee, R. K. Balan, Deepmon: Mobile gpu-based deep learning framework for continuous vision applications, in: Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys '17, ACM, New York, NY, USA, 2017, pp. 82–95.
[30] S. Han, J. Pool, J. Tran, W. J. Dally, Learning both weights and connections for efficient neural networks, in: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS'15, 2015, pp. 1135–1143.
[31] S. Anwar, K. Hwang, W. Sung, Structured pruning of deep convolutional neural networks, J. Emerg. Technol. Comput. Syst. 13 (3) (2017) 32:1–32:18.
[32] F. Moya Rueda, R. Grzeszick, G. A. Fink, Neuron pruning for compressing deep networks using maxout architectures, in: V. Roth, T. Vetter (Eds.), Pattern Recognition, Springer International Publishing, Cham, 2017, pp. 177–188.
[33] W. Chen, J. T. Wilson, S. Tyree, K. Q. Weinberger, Y. Chen, Compressing neural networks with the hashing trick, Computer Science (2015) 2285–2294.
[34] Z. Li, B. Ni, W. Zhang, X. Yang, G. Wen, Performance guaranteed network acceleration via high-order residual quantization, in: 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
[35] Z. Liu, M. Sun, T. Zhou, G. Huang, T. Darrell, Rethinking the value of network pruning, in: International Conference on Learning Representations, 2019.
[36] S. Han, H. Shen, M. Philipose, S. Agarwal, A. Wolman, A. Krishnamurthy, Mcdnn: An approximation-based execution framework for deep stream processing under resource constraints, in: Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys '16, ACM, New York, NY, USA, 2016, pp. 123–136.
[37] P. Guo, B. Hu, R. Li, W. Hu, Foggycache: Cross-device approximate computation reuse, 2018, pp. 19–34. doi:10.1145/3241539.3241557.
[38] A. Mathur, N. D. Lane, S. Bhattacharya, A. Boran, C. Forlivesi, F. Kawsar, Deepeye: Resource efficient local execution of multiple deep vision models using wearable commodity hardware, in: Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys '17, Niagara Falls, NY, USA, June 19-23, 2017, 2017, pp. 68–81.
[39] Y. Geng, Y. Yang, G. Cao, Energy-efficient computation offloading for multicore-based mobile devices, in: INFOCOM 2018 - IEEE Conference on Computer Communications, Proceedings - IEEE INFOCOM, Institute of Electrical and Electronics Engineers Inc., United States, 2018, pp. 46–54. doi:10.1109/INFOCOM.2018.8485875.
[40] Google, Tensorflow lite, https://tensorflow.google.cn/lite/guide.
[41] Facebook, Caffe2, https://caffe2.ai/blog/2018/05/02/Caffe2_PyTorch_1_0.html.
[42] Apple, Core ML 2, https://developer.apple.com/documentation/coreml.
[43] Tencent, Ncnn, https://github.com/Tencent/ncnn.
[44] Tencent, Feathercnn, https://github.com/Tencent/FeatherCNN.
[45] Qualcomm, Snpe, https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk.
[46] Xiaomi, Mace, https://github.com/XiaoMi/mace.
[47] Amazon, Amazon ec2, https://aws.amazon.com/de/ec2/.
[48] T. Y.-H. Chen, L. Ravindranath, S. Deng, P. Bahl, H. Balakrishnan, Glimpse: Continuous, real-time object recognition on mobile devices, in: Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, SenSys '15, ACM, New York, NY, USA, 2015, pp. 155–168.
[49] P. Jain, J. Manweiler, R. Roy Choudhury, Low bandwidth offload for mobile ar, in: Proceedings of the 12th International Conference on Emerging Networking EXperiments and Technologies, CoNEXT '16, ACM, New York, NY, USA, 2016, pp. 237–251.
[50] R. Shea, A. Sun, S. Fu, J. Liu, Towards fully offloaded cloud-based ar: Design, implementation and experience, in: Proceedings of the 8th ACM on Multimedia Systems Conference, MMSys '17, ACM, New York, NY, USA, 2017, pp. 321–330.
[51] Mobile-edge computing introductory technical white paper.
[52] …offloading for mobile cloud computing in dense wireless networks, IEEE Transactions on Mobile Computing PP (99) (2016) 1–1.
[53] Y. Liu, M. J. Lee, Y. Zheng, Adaptive multi-resource allocation for cloudlet-based mobile cloud computing system, IEEE Transactions on Mobile Computing 15 (10) (2016) 2398–2410.
[54] J. Zheng, C. Yueming, W. Yuan, S. X. Sherman, Dynamic computation offloading for mobile cloud computing: A stochastic game-theoretic approach, IEEE Transactions on Mobile Computing PP (99) (2018) 1–1.
[55] J. Xu, L. Chen, P. Zhou, Joint service caching and task offloading for mobile edge computing in dense networks, in: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, 2018, pp. 207–215. doi:10.1109/INFOCOM.2018.8485977.
[56] Q. Liu, S. Huang, J. Opadere, T. Han, An edge network orchestrator for mobile augmented reality, in: 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 756–764.
[58] Y. Mao, C. You, J. Zhang, K. Huang, K. B. Letaief, A survey on mobile edge computing: The communication perspective, IEEE Communications Surveys and Tutorials 19 (4) (2017) 2322–2358.
[59] P. Mach, Z. Becvar, Mobile edge computing: A survey on architecture and computation offloading, IEEE Communications Surveys and Tutorials 19 (3) (2017) 1628–1656.
[60] K. Ota, M. S. Dao, V. Mezaris, F. G. B. De Natale, Deep learning for mobile multimedia: A survey, ACM Transactions on Multimedia Computing, Communications and Applications 13 (3s) (2017) 1–22.
[61] K. Kumar, J. Liu, Y.-H. Lu, B. Bhargava, A survey of computation offloading for mobile systems, Mobile Networks and Applications 18 (1) (2013) 129–140.
[62] D. Li, X. Wang, D. Kong, Deeprebirth: Accelerating deep neural network execution on mobile devices.
[63] S. Han, H. Mao, W. J. Dally, Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding.
[64] P. Yin, J. Lyu, S. Zhang, S. J. Osher, Y. Qi, J. Xin, Understanding straight-through estimator in training activation quantized neural nets, in: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019, 2019.
[65] F. N. Iandola, M. W. Moskewicz, K. Ashraf, S. Han, W. J. Dally, K. Keutzer, Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5mb model size, CoRR abs/1602.07360.
[66] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, Mobilenets: Efficient convolutional neural networks for mobile vision applications, CoRR abs/1704.04861.
[67] C. Tai, T. Xiao, X. Wang, W. E, Convolutional neural networks with low-rank regularization, in: 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings.
[68] J. Park, S. Li, W. Wen, P. T. P. Tang, H. Li, Y. Chen, P. Dubey, Faster cnns with direct sparse convolutions and guided pruning.
[69] C. Bucila, R. Caruana, A. Niculescu-Mizil, Model compression, in: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, August 20-23, 2006, 2006, pp. 535–541.
[70] G. E. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural network, CoRR abs/1503.02531.
[71] T. Chen, L. Lin, W. Zuo, X. Luo, L. Zhang, Learning a wavelet-like auto-encoder to accelerate deep neural networks, in: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, 2018, pp. 6722–6729.
[72] M. Tan, B. Chen, R. Pang, V. Vasudevan, M. Sandler, A. Howard, Q. V. Le, Mnasnet: Platform-aware neural architecture search for mobile, in: IEEE Conference on
[57] M. Chen, Y. Hao, Y. Li, C.-F. Lai, D. Wu, On the com- Computer Vision and Pattern Recognition, CVPR 2019,
putation offloading at ad hoc cloudlet: architecture and Long Beach, CA, USA, June 16-20, 2019, 2019, pp.
service modes, IEEE Communications Magazine 53 (6) 2820–2828.
(2015) 18–24. [73] P. Zhang, E. Lo, B. Lu, High performance depthwise
28 Wang Yingchun, et al.
and pointwise convolutions on mobile devices, in: The heterogeneity-aware coded cooperative computation at
Thirty-Fourth AAAI Conference on Artificial Intelli- the edge, in: 2018 IEEE 26th International Conference
gence, AAAI 2020, The Thirty-Second Innovative Ap- on Network Protocols (ICNP), 2018.
plications of Artificial Intelligence Conference, IAAI [86] X. Ge, S. Tu, G. Mao, C.-X. Wang, T. Han, 5g ultra-
2020, The Tenth AAAI Symposium on Educational Ad- dense cellular networks, IEEE Wireless Communica-
vances in Artificial Intelligence, EAAI 2020, New York, tions 23 (1) (2016) 72–79.
NY, USA, February 7-12, 2020, AAAI Press, 2020, pp. [87] K. Poularakis, J. Llorca, A. M. Tulino, I. Taylor, L. Tas-
6795–6802. siulas, Joint service placement and request routing in
[74] A. Tveit, T. Morland, T. B. Røst, Deeplearningkit - multi-cell mobile edge computing networks, in: 2019
an gpu optimized deep learning framework for apple’s IEEE Conference on Computer Communications, INFO-
ios, os x and tvos developed in metal and swift, ArXiv COM 2019,Paris, France, April 29 - May 2, 2019, 2019,
abs/1605.04614. pp. 10–18.
[75] N. D. Lane, S. Bhattacharya, P. Georgiev, C. For- [88] M. Chen, Y. Hao, Task offloading for mobile edge com-
livesi, L. Jiao, L. Qendro, F. Kawsar, Deepx: puting in software defined ultra-dense network, IEEE
A software accelerator for low-power deep learn- Journal on Selected Areas in Communications 36 (3)
ing inference on mobile devices, in: 2016 15th (2018) 587–597.
ACM/IEEE International Conference on Information [89] F. Zafari, J. Li, K. K. Leung, D. Towsley, A. Swami,
of
Processing in Sensor Networks (IPSN), 2016, pp. 1–12. A game-theoretic approach to multi-objective resource
doi:10.1109/IPSN.2016.7460664. sharing and allocation in mobile edge clouds, CoRR
[76] BaiDu, Paddle model, https://github.com/ abs/1808.06937.
ro
PaddlePaddle/models. [90] X. Ran, H. Chen, X. Zhu, Z. Liu, J. Chen, Deepdecision:
[77] N. Makris, V. Passas, T. Korakis, L. Tassiulas, Employ- A mobile deep learning framework for edge video an-
ing mec in the cloud-ran: An experimental analysis, in: alytics, in: IEEE INFOCOM 2018 - IEEE Conference
Proceedings of the 2018 on Technologies for the Wire-
less Edge Workshop,EdgeTech@MobiCom 2018, New
Delhi, India, November 2, 2018, 2018, pp. 15–19.
-p
[91]
on Computer Communications, 2018, pp. 1421–1429.
doi:10.1109/INFOCOM.2018.8485905.
X. Ran, H. Chen, Z. Liu, J. Chen, Delivering deep learn-
re
[78] W. Jiang, G. Feng, S. Qin, T. S. P. Yum, Efficient ing to mobile devices via offloading, in: Proceedings of
d2d content caching using multi-agent reinforcement the Workshop on Virtual Reality and Augmented Real-
learning, in: IEEE INFOCOM 2018-IEEE Conference ity Network, VR/AR Network’17, ACM, New York, NY,
lP
local execution of multiple deep vision models using on Information Theory (ISIT), 2016, pp. 1451–1455.
wearable commodity hardware, in: Proceedings of the doi:10.1109/ISIT.2016.7541539.
15th Annual International Conference on Mobile Sys- [93] Z. Chen, W. Hu, J. Wang, S. Zhao, B. Amos, G. Wu,
tems, Applications, and Services, MobiSys ’17, ACM, K. Ha, K. Elgazzar, P. Pillai, R. Klatzky, D. Siewiorek,
ur
New York, NY, USA, 2017, pp. 68–81. M. Satyanarayanan, An empirical study of latency in an
[80] S. Sundar, B. Liang, Offloading dependent tasks with emerging class of edge computing applications for wear-
communication delay and deadline constraint, in: 2018 able cognitive assistance, in: Proceedings of the Second
Jo
IEEE Conference on Computer Communications, IN- ACM/IEEE Symposium on Edge Computing, SEC ’17,
FOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, ACM, New York, NY, USA, 2017, pp. 14:1–14:14.
2018, pp. 37–45. [94] M. Kamoun, W. Labidi, M. Sarkiss, Joint resource al-
[81] H. T. Dinh, C. Lee, D. Niyato, W. Ping, A survey of mo- location and offloading strategies in cloud enabled cel-
bile cloud computing: architecture, applications, and ap- lular networks, in: 2015 IEEE International Confer-
proaches, Wireless Communications and Mobile Com- ence on Communications (ICC), 2015, pp. 5529–5534.
puting 13 (18) (2013) 1587–1611. doi:10.1109/ICC.2015.7249203.
[82] D. V. Le, C. K. Tham, A deep reinforcement learn- [95] W. Labidi, M. Sarkiss, M. Kamoun, Energy-optimal re-
ing based offloading scheme in ad-hoc mobile clouds, source scheduling and computation offloading in small
in: IEEE INFOCOM 2018 - IEEE Conference on Com- cell networks, in: 2015 22nd International Confer-
puter Communications Workshops, INFOCOM Work- ence on Telecommunications (ICT), 2015, pp. 313–318.
shops 2018, Honolulu, HI, USA, April 15-19,2018, doi:10.1109/ICT.2015.7124703.
2018, pp. 760–765. [96] X. Chen, L. Jiao, W. Li, X. Fu, Efficient multi-user com-
[83] S. Teerapittayanon, B. McDanel, H. T. Kung, Distributed putation offloading for mobile-edge cloud computing,
deep neural networks over the cloud, the edge and end IEEE/ACM Transactions on Networking 24 (5) (2016)
devices, in: 37th IEEE International Conference on Dis- 2795–2808. doi:10.1109/TNET.2015.2487344.
tributed Computing Systems,ICDCS 2017, Atlanta, GA, [97] H. Flores, P. Hui, P. Nurmi, E. Lagerspetz, S. Tarkoma,
USA, June 5-8, 2017, 2017, pp. 328–339. J. Manner, V. Kostakos, Y. Li, X. Su, Evidence-
[84] V. B. C. Souza, W. Ramírez, X. Masip-Bruin, E. Marín- aware mobile computational offloading, IEEE Transac-
Tordera, G. Ren, G. Tashakor, Handling service al- tions on Mobile Computing 17 (8) (2018) 1834–1850.
location in combined fog-cloud scenarios, in: 2016 doi:10.1109/TMC.2017.2777491.
IEEE International Conference on Communications, [98] W. R. KhudaBukhsh, B. Alt, S. Kar, A. Rizk, H. Koeppl,
ICC 2016,Kuala Lumpur, Malaysia, May 22-27, 2016, Collaborative uploading in heterogeneous networks: Op-
2016, pp. 1–5. timal and adaptive strategies, in: 2018 IEEE Conference
[85] Y. Keshtkarjahromi, Y. Xing, H. Seferoglu, Dynamic on Computer Communications, INFOCOM 2018,Hon-
A survey on deploying mobile deep learning applications: a systemic and technical perspective 29
olulu, HI, USA, April 16-19, 2018, 2018, pp. 1–9. Computer Communications, IEEE, 2019, pp. 82–90.
[99] X. Wang, M. Chen, T. Taleb, A. Ksentini, V. C. M. Le- [112] D. Liu, B. Chen, C. Yang, A. F. Molisch, Caching at the
ung, Cache in the air: exploiting content caching and de- wireless edge: design aspects, challenges, and future di-
livery techniques for 5g systems, IEEE Communications rections, IEEE Communications Magazine 54 (9) (2016)
Magazine 52 (2) (2014) 131–139. 22–28.
[100] L. Wang, L. Jiao, T. He, J. Li, M. Mühlhäuser, Service [113] A. Tatar, M. D. D. Amorim, S. Fdida, P. Antoniadis, A
entity placement for social virtual reality applications in survey on predicting the popularity of web content, Jour-
edge computing, in: 2018 IEEE Conference on Com- nal of Internet Services and Applications 5 (1) (2014) 8.
puter Communications, INFOCOM 2018,Honolulu, HI, [114] C. Bernardini, T. Silverston, F. Olivier, Mpc:popularity-
USA, April 16-19, 2018, 2018, pp. 468–476. based caching strategy for content centric networks, in:
[101] J. Su, D. V. Vargas, K. Sakurai, One pixel attack for fool- Proceedings of IEEE International Conference on Com-
ing deep neural networks, IEEE Transactions on Evolu- munications,ICC 2013, Budapest, Hungary, June 9-13,
tionary Computation(2019). 2013, 2013, pp. 3619–3623.
[102] K.-H. K. Xiufeng Xie, Source compression with [115] J. Li, S. Shakkottai, J. C. S. Lui, V. Subramanian, Ac-
bounded dnn perception loss for iot edge computer curate learning or fast mixing? dynamic adaptability of
vision, in: Proceedings of the 25th Annual Interna- caching algorithms, IEEE Journal on Selected Areas in
tional Conference on Mobile Computing and Network- Communications 36 (6) (2018) 1314–1330.
of
ing(2019), MobiCom ’19, ACM, 2019. [116] S. Misra, N. Saha, Detour: Dynamic task offloading in
[103] Y. Zhang, X. Jiang, Y. Wang, K. Lei, Cache and delivery software-defined fog for iot applications, IEEE Journal
of vr video over named data networking, in: IEEE IN- on Selected Areas in Communications 37 (5) (2019) 1–
ro
FOCOM 2018 - IEEE Conference on Computer Com- 1.
munications Workshops, INFOCOM Workshops 2018, [117] K. Poularakis, J. Llorca, A. M. Tulino, I. Taylor,
Honolulu, HI, USA, April 15-19,2018, 2018, pp. 280– L. Tassiulas, Joint service placement and request routing
285.
[104] H. Hao, C. Xu, M. Wang, H. Xie, Y. Liu, D. O.
Wu, Knowledge-centric proactive edge caching over mo-
-p in multi-cell mobile edge computing networks, CoRR
abs/1901.08946.
[118] H. Gong, K. Xing, W. Du, A user activity pattern mining
re
bile content distribution network, in: IEEE INFOCOM system based on human activity recognition and location
2018 - IEEE Conference on Computer Communications service, in: IEEE INFOCOM 2018 - IEEE Conference
Workshops, INFOCOM Workshops 2018, Honolulu, HI, on Computer Communications Workshops, INFOCOM
lP
USA, April 15-19,2018, 2018, pp. 450–455. Workshops 2018, Honolulu, HI, USA, April 15-19,2018,
[105] N. Mohan, P. Zhou, K. Govindaraj, J. Kangasharju, 2018, pp. 1–2.
Managing data in computational edge clouds, in: Pro- [119] H. Zhang, A. Wang, D. Li, W. Xu, Deepvoice: A
ceedings of the Workshop on Mobile Edge Communica- voiceprint-based mobile health framework for parkin-
na
tions, MECOMM@SIGCOMM 2017, Los Angeles, CA, son’s disease identification, in: 2018 IEEE EMBS Inter-
USA, August 21, 2017, 2017, pp. 19–24. national Conference on Biomedical & Health Informat-
[106] J. Gu, W. Wang, A. Huang, H. Shan, Proactive stor- ics, BHI 2018, Las Vegas, NV, USA, March 4-7, 2018,
age at caching-enable base stations in cellular networks, 2018, pp. 214–217.
ur
in: 24th IEEE Annual International Symposium on [120] A. Stisen, H. Blunck, S. Bhattacharya, T. S. Prentow,
Personal, Indoor, and Mobile Radio Communications, M. B. Kjærgaard, A. K. Dey, T. Sonne, M. M. Jensen,
PIMRC 2013, London, United Kingdom,September 8- Smart devices are different: Assessing and mitigating-
Jo
11, 2013, 2013, pp. 1543–1547. mobile sensing heterogeneities for activity recognition,
[107] F. Wang, F. Wang, J. Liu, R. Shea, L. Sun, Intelligent in: Proceedings of the 13th ACM Conference on Embed-
video caching at network edge: A multi-agent deep re- ded Networked Sensor Systems, SenSys 2015, Seoul,
inforcement learning approach, in: IEEE INFOCOM South Korea, November 1-4, 2015, 2015, pp. 127–140.
2020 - IEEE Conference on Computer Communications, [121] S. Yao, Y. Zhao, H. Shao, D. Liu, S. Liu, Y. Hao, A. Piao,
2020. S. Hu, S. Lu, T. F. Abdelzaher, Sadeepsense: Self-
[108] A. Raman, N. Sastry, A. Sathiaseelan, J. Chandaria, attention deep learning framework for heterogeneous on-
A. Secker, Wi-stitch: Content delivery in converged edge device sensors in internet of things applications, in: 2019
networks, in: Proceedings of the Workshop on Mobile IEEE Conference on Computer Communications, INFO-
Edge Communications, MECOMM@SIGCOMM 2017, COM 2019, Paris, France, April 29 - May 2, 2019, 2019,
Los Angeles, CA, USA, August 21, 2017, 2017, pp. 13– pp. 1243–1251.
18. [122] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta,
[109] A. Raman, N. Sastry, N. Mokari, M. Salehi, T. Faisal, P. Popovski, Five disruptive technology directions for 5g,
A. Secker, J. Chandaria, Care to share?: An empirical IEEE Communications Magazine 52 (2) (2014) 74–80.
analysis of capacity enhancement by sharing at the edge, [123] M. Cheung, J. She, L. Liu, Deep learning-based on-
in: Proceedings of the 2018 on Technologies for the line counterfeit-seller detection, in: IEEE INFOCOM
Wireless Edge Workshop, EdgeTech@MobiCom 2018, 2018 - IEEE Conference on Computer Communications
New Delhi, India, November 2, 2018, 2018, pp. 27–31. Workshops, INFOCOM Workshops 2018, Honolulu, HI,
[110] A. Mete, S. Moharir, Caching policies for d2d-assisted USA, April 15-19, 2018, 2018, pp. 51–56.
content delivery systems, in: Proceedings of the 2018 [124] W. Chang, Y. Yu, J. Chen, Z. Zhang, S. Ko, T. Yang,
on Technologies for the Wireless Edge Workshop, ACM, C. Hsu, L. Chen, M. Chen, A deep learning based wear-
2018, pp. 3–7. able medicines recognition system for visually impaired
[111] X. Zhang, Q. Zhu, D2d offloading for statistical qos people, in: IEEE International Conference on Artifi-
provisionings over 5g multimedia mobile wireless net- cial Intelligence Circuits and Systems, AICAS 2019,
works, in: IEEE INFOCOM 2019-IEEE Conference on Hsinchu, Taiwan, March 18-20, 2019, 2019, pp. 207–
30 Wang Yingchun, et al.
208.
[125] C. Hodges, S. An, H. Rahmani, M. Bennamoun, Deep
learning for driverless vehicles, in: Handbook of Deep
Learning Applications, 2019, pp. 83–99.
of
ro
-p
re
lP
na
ur
Jo
Conflict of interest
The authors declare that they have no conflicts of interest with respect to this work.
We further declare that we have no commercial or associative interests that represent a conflict of
interest in connection with the submitted work.