An Empirical Study of Federated Learning on IoT-Edge Devices: Resource Allocation and Heterogeneity
These challenges make FL impractical and limit the motivation of parties to join the federation for training.

Despite the aforementioned real-world issues, most existing studies on FL rely heavily on simulation settings or small-scale testbeds of devices [8, 9, 10] to examine the behavior of their systems. While simulation settings are useful for controlled testing and development of FL models, they face significant challenges in adequately covering all operational aspects of real-world deployments. Specifically, existing simulators cannot emulate crucial aspects of realistic execution environments, such as resource consumption (e.g., memory, CPU/GPU usage, battery life) and network connectivity (e.g., bandwidth and network congestion). These factors significantly impact the performance of FL systems, as demonstrated in Section IV. Additionally, other realistic environment aspects, such as data distribution, underlying software libraries, and execution settings, introduce further challenges that can affect FL performance. This motivates us to conduct more comprehensive evaluations of such aspects to ensure the effectiveness and scalability of FL systems.

In Section II, we observe a lack of experimental studies that systematically investigate the implementation of FL on real devices and assess the impact of intrinsic heterogeneity on performance and costs. Although there have been some attempts to implement FL on IoT-Edge devices at small scales with simplistic settings, more reproducible experiments in larger and more realistic settings are desirable. Hence, to the best of our knowledge, our study pushes the experiment scale and complexity to a new level.

A. Objectives, Research Questions and Scope

To identify potential issues and limitations on real devices that may not be apparent in simulated environments, we focus our study on the impact of resource allocation and heterogeneity, both independently and in combination, in realistic environments. To achieve this, we focus on the following research questions (RQ):

• RQ1: What are the behaviors of an FL implementation in realistic environments compared to a simulation setting? In this RQ, we compare many simulation and on-device deployment aspects. We want to see how well simulation results represent reality, because FL experiments conducted in a controlled laboratory setting may not accurately reflect the challenges and complexities of realistic device-based environments.

• RQ2: How do resource allocation and heterogeneity affect the learning performance and operation costs? Several factors can affect FL deployment. This RQ focuses on the client participation rate, communication bandwidth, and device and data heterogeneity. We test each factor independently to learn its impact on the behaviors of FL. Specifically, we want to observe the impact of varying the number and type of devices, the bandwidth, and the data distribution on the FL process.

• RQ3: How do these two factors, resource allocation and heterogeneity, simultaneously affect the learning performance and operation costs? This RQ is an essential study for understanding the impact of the combined factors specified in RQ2. Additionally, we aim to find the dominant factor in the behaviors of FL in a real-world deployment.

To answer these questions, we need stable FL systems that can be deployed on our targeted hardware, i.e., Raspberry Pi 3 (Pi3), Raspberry Pi 4 (Pi4), Jetson Nano (Nano), and Jetson TX2 (TX2), and that can use the GPUs on edge computing boards. While many algorithms are accompanied by source code, only Federated Averaging (FedAvg) [5] satisfies our requirements, owing to its popularity. FedAvg has been extensively studied and evaluated in the literature, with a large number of works reporting its performance characteristics and limitations in simulations; however, understanding of its behavior on real devices is still limited (cf. Section II). Hence, we focus on FedAvg in this paper and leave other algorithms for future work. Nevertheless, our experiment design in Section III is general enough to be replicated with other algorithms, provided their implementations are stable enough to run on the targeted devices.

B. Our Key Findings

In this light, our extensive set of experiments reported in Section IV reveals the following key findings:

• The on-device settings can achieve training accuracy similar to their simulation counterparts, with similar convergence behaviors. However, when it comes to operational behaviors related to computation and communication, the on-device settings show much more complicated behavior patterns in realistic IoT-Edge deployments.

• The disparity in computational and networking resources among the participating devices leads to longer model update (local and global) exchange times, because high-computation devices must wait for the server to receive and aggregate local updates from low-computation devices. This hints that an oversimplified emulation of these aspects in a simulation setting is highly likely to lead to unexpected outcomes for an FL algorithm at the deployment phase.

• Data heterogeneity is the most dominant factor in FL performance, followed by the number of clients. The performance of the global model is affected most by the data distribution (i.e., Non-IID and Extreme Non-IID) of each participating client, especially for challenging learning tasks. Hence, combined with the disparity in computational and networking resources, FL on diverse IoT-Edge devices in realistic deployment settings requires further understanding of on-device behaviors with all these factors acting in tandem.

C. Paper Outline

The rest of this article is organized as follows. Section II presents preliminaries to our work and discusses some existing surveys and empirical studies on FL. In Section III, we present our experimental designs, followed by our results and findings in Section IV. Finally, we give further discussions in Section V and conclude this empirical study in Section VI.
IEEE INTERNET OF THINGS JOURNAL, VOL. X, NO. X, X 2023
II. PRELIMINARIES AND RELATED WORKS

A. Federated Learning

In the standard FL framework, data for learning tasks is acquired and processed locally at the IoT-Edge nodes, and only the trained model parameters are transmitted to the central server for aggregation. In general, along with an initialization stage, FL involves the following stages:

• Stage 0 (Initialization): The aggregation server S first initializes the weights w0 of the global model and hyperparameters such as the number of communication rounds T, the number of selected clients for each round N, and the local training details.

• Stage 1 (Client training): All selected clients C1, C2, C3, ..., CN receive the current global weights from S. Next, each Ci updates its local model parameters wi^t using its local dataset Di, where t denotes the current communication round. Upon completion of the local training, all selected clients send their local weights to S for model aggregation.

• Stage 2 (Model aggregation): S aggregates the received local weights based on a certain mechanism and then sends the aggregated weights back to the clients for the next round of local training.

B. Federated Averaging Algorithm

Federated Averaging (FedAvg) is the de facto FL algorithm included in most FL systems [5]. As shown in Algorithm 1, FedAvg aggregates the locally trained model parameters by weighted averaging, proportional to the size of the local dataset Di held by each client Ci (corresponding to Stage 2 above). Note that many advanced FL algorithms with different purposes have been introduced in the last few years (e.g., FedProx [11] and FedMA [12]) [13, 14].

Algorithm 1: FedAvg Algorithm [5].
 1: Aggregation Server executes:
 2:   initialize: w ← w0
 3:   for each round t = 1, 2, 3, ..., T do
 4:     for each client i = 1, 2, 3, ..., N in parallel do
 5:       wi^t ← w^(t-1)
 6:       wi^t ← ClientTraining(wi^t, Di)
 7:     end for
 8:     // ModelAggregation
 9:     w^(t+1) ← (1 / Σ_{i=1..N} ni) · Σ_{i=1..N} ni · wi^t
10:   end for
11:   return: w^T
12:
13: ClientTraining(wi, Di):  // Run on client Ci
14:   for each epoch e = 1, 2, 3, ..., E do
15:     wi ← wi − η∇l(wi; Di)
16:   end for
17:   return: wi

C. Related Works

Several theoretical surveys and simulation-based empirical studies on FL are available in the literature. Nguyen et al. [15] explore and analyze the potential of FL for enabling a wide range of IoT services, including IoT data sharing, data offloading and caching, attack detection, localization, mobile crowdsensing, and IoT privacy and security. Imteaj et al. [16] discuss the implementation challenges and issues when applying FL to an IoT environment. Zhu et al. [17] provide a detailed analysis of the influence of Non-IID data on different types of ML models in both horizontal and vertical FL. Li et al. [18] conduct extensive experiments to evaluate state-of-the-art FL algorithms on Non-IID data silos and find that Non-IID data does bring significant challenges to the learning accuracy of FL algorithms, and that none of the existing state-of-the-art FL algorithms outperforms the others in all cases. Recently, Matsuda et al. [19] benchmark the performance of existing personalized FL methods through comprehensive experiments to evaluate the characteristics of each method and find that there are no champion methods. Caldas et al. [20] propose LEAF, a modular simulation-based benchmarking framework for learning in federated settings; LEAF includes a suite of open-source federated datasets, a rigorous evaluation framework, and a set of reference implementations. To the best of our knowledge, we are the first to consider an empirical study of FL on IoT-Edge devices.

For real-world FL implementations, Wu et al. [8] present FedAdapt, an adaptive offloading FL framework based on reinforcement learning and clustering that identifies which layers of the DNN should be offloaded from each device onto a server; experiments are carried out on a lab-based testbed comprising two Pi3s, two Pi4s, and one Jetson Xavier. Sun et al. [9] propose a model selection and adaptation system for FL (FedMSA), which includes a hardware-aware model selection algorithm, and demonstrate the effectiveness of their method on a network of two Pi4s and five Nanos. Mills et al. [10] propose adapting FedAvg to use a distributed form of Adam optimization and test their method on a small testbed of five Pi2s and five Pi3s. Furthermore, Zhang et al. [21] build the FedIoT platform for on-device anomaly data detection and evaluate their platform on a network of ten Pi4s. However, these attempts are still small in scale and do not represent real-world environments.

TABLE I: Comparison between our work and others.

                                   Device-based
Empirical       Simulation-   Small-scale   Large-scale   Device
Studies         based         (up to 10)    (up to 64)    Heter.*
[18, 19, 20]    ✓             ✗             ✗             ✗
[21, 9, 10]     ✓             ✓             ✗             ✗
Ours            ✓             ✓             ✓             ✓

* Device Heterogeneity: study of different types of IoT devices.
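The weighted aggregation step of Algorithm 1 (Section II-B, line 9) can be sketched in a few lines of Python. The flat-list representation of model weights below is a simplification for illustration, not the tensor layout used by PyTorch or Flower:

```python
def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of client weights (Algorithm 1, line 9):
    w^(t+1) = sum_i(n_i * w_i^t) / sum_i(n_i).

    client_weights: one flat parameter list per client.
    client_sizes:   local dataset sizes n_i.
    """
    total = sum(client_sizes)
    aggregated = [0.0] * len(client_weights[0])
    for w_i, n_i in zip(client_weights, client_sizes):
        for j, w in enumerate(w_i):
            aggregated[j] += (n_i / total) * w
    return aggregated

# Two hypothetical clients; the client with 3x more data dominates:
w_new = fedavg_aggregate([[1.0, 0.0], [3.0, 4.0]], [1, 3])
print(w_new)  # [2.5, 3.0]
```

With equal dataset sizes, the update reduces to a plain mean over the participating clients.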
III. EXPERIMENTAL DESIGN

This section describes how we designed our experiments to answer the research questions in Section I-A. Starting with data preparation, we then implement FL on IoT-Edge devices with different settings based on the evaluation factors we defined.
After that, we use a bag of metrics to analyze the impact of these factors individually, and their combined effects, in different aspects. Fig. 2 illustrates this workflow in detail.

A. Data Preparation and Models

1) Datasets: We use two datasets in this work: CIFAR10 [22] and CIFAR100 [22], which are commonly used in previous studies on FL [11, 18]. CIFAR10, the simpler of the two, consists of 60,000 32x32 color images, each labeled with one of 10 mutually exclusive classes; there are 6,000 images per class, with 5,000 training and 1,000 testing images. CIFAR100 also consists of 60,000 32x32 color images but is more challenging to train on: each image comes with one of 100 fine-grained labels, giving 600 images per class, with 500 training and 100 testing images.

2) Data Partitioning: The CIFAR10 and CIFAR100 datasets are not originally separated for FL, so we need to divide them synthetically. While the test sets are kept at the server for testing the aggregated model, we divide the training set of each dataset into 64 disjoint partitions with an equal number of samples in three different ways, simulating three scenarios of heterogeneity: IID, Non-IID, and Extreme Non-IID (ExNon-IID). The IID strategy adopts independent and random division; as shown in Figs. 3(a) and 3(b), the data distribution of each client is essentially the same. The Non-IID and ExNon-IID strategies use the biased divisions proposed in [5, 23]. Specifically, the whole dataset is sorted according to the labels and divided into chunks, and these chunks are then randomly assigned to different clients; the number of chunks controls the degree of heterogeneity across clients. As shown in Figs. 3(c)-(f), each client in Non-IID contains approximately four and ten data classes in CIFAR10 and CIFAR100, respectively, whereas each client in ExNon-IID contains only one and two data classes, respectively, which simulates extreme data heterogeneity across clients.

3) Model Architecture: Following previous works [5, 20], we study a popular CNN model designed for image classification tasks, called CNN3, on the two datasets. The model includes only two 5x5 convolution layers (the first with 32 channels, the second with 64), each followed by a ReLU activation function and 2x2 max pooling. After that, one fully connected layer with 512 units and ReLU activation is added, followed by a softmax layer as the classifier. The number of output units is 10 for CIFAR10 and 100 for CIFAR100. Owing to its simple architecture, the model does not need massive resources for training, making it suitable for deployment on IoT-Edge devices.

B. Hardware and Software Specifications

In the past few years, many IoT-Edge devices with different prices and capabilities have entered the market. In this work, we use the most popular ones, namely the Pi3, Pi4, Nano, and TX2. Different types and generations of devices have different resources and processing capabilities, and a diverse pool of devices helps us represent the real world more accurately. Our devices are connected to a workstation, which is used as the server, via a network of IoT-Edge devices and switches. Fig. 4 shows a snapshot of our infrastructure; Table II provides the specifications of these devices in more detail, along with those of the server machine and the simulation machine.

For the software, we use the PyTorch [24] framework, version 1.13.1, to implement the deep learning components, and the Flower [25] framework, version 1.11.0, for the FedAvg algorithm. Additionally, we use Docker to create a separate container on each device to perform local training.

C. Evaluation Metrics

In this study, we use a comprehensive set of metrics to characterize and quantify the impact of heterogeneity factors on the behaviors of an FL implementation in realistic environments. Specifically, test accuracy and convergence speed are used to evaluate the learning performance. Averaged training time, memory, and GPU/CPU utilization are used to measure the computational costs. Finally, we use the averaged model update (local and global) exchange time between the clients and the aggregation server to measure the communication cost. Table III provides concise definitions of all the metrics we use.

D. Experiments Setup

1) Behaviors of On-Device FL Implementation (RQ1): First of all, we conduct a baseline experiment in simulation. Specifically, we simulate eight clients, each holding one of the first eight partitions (12.5% of the total partitions) of the CIFAR10 IID dataset. For the training settings, we train the simple CNN3 model described above for 500 communication rounds; at each round, the model is trained for 2 local epochs at the clients, the SGD optimizer is used with a learning rate of 0.01, and the batch size is set to 16.
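The label-sorted chunk division behind the Non-IID and ExNon-IID partitions in Section III-A can be sketched as follows. The function and its `chunks_per_client` knob are illustrative assumptions, not our exact partitioning code:

```python
import random

def partition_by_label(labels, num_clients, chunks_per_client, seed=0):
    """Biased division in the style of [5, 23]: sort sample indices by
    label, cut them into equal chunks, and deal chunks to clients at
    random. Fewer chunks per client means fewer classes per client,
    i.e., a higher degree of data heterogeneity."""
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    num_chunks = num_clients * chunks_per_client
    chunk_size = len(labels) // num_chunks
    chunks = [order[c * chunk_size:(c + 1) * chunk_size]
              for c in range(num_chunks)]
    random.Random(seed).shuffle(chunks)
    return [sum(chunks[c * chunks_per_client:(c + 1) * chunks_per_client], [])
            for c in range(num_clients)]

# Toy example: 8 samples, 2 classes, 2 clients, 1 chunk each (the
# extreme case): every client ends up holding a single class.
labels = [0, 1, 0, 1, 0, 1, 0, 1]
parts = partition_by_label(labels, num_clients=2, chunks_per_client=1)
```

Increasing `chunks_per_client` mixes more classes into each client's partition, moving the split from ExNon-IID toward Non-IID.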
Fig. 3. Data distribution of the first 24 clients in the CIFAR10 and CIFAR100 datasets.
TABLE II: Hardware Specifications.
To answer RQ1, described in Section I-A, we then turn the simulation environment of the above experiment into realistic environments by sequentially using eight Pi3s, eight Pi4s, and eight Nanos as clients. These devices are connected to a server machine via Ethernet connections. For comparison, all training settings are kept as in the baseline. We use all the metrics defined in Table III to describe the behaviors of the FL implementation. The results and conclusions are presented in Section IV-A.

2) Impact of Single Factor (RQ2): For RQ2, we consider two critical factors in FL, namely resource allocation and heterogeneity. Resource allocation covers the number of participating clients and the connection's communication bandwidth, while heterogeneity covers device heterogeneity and data heterogeneity (statistical heterogeneity). To explore the impact of these factors, we conduct the extensive set of experiments detailed in Fig. 5. The training settings are the same as in the RQ1 baseline experiment. Through the experiments defined in Fig. 5, we can observe what happens when the number of participating clients increases, when the communication bandwidth is saturated, and when intrinsic heterogeneity is introduced across clients. The results and conclusions for the RQ2 experiments are provided in Section IV-B.

3) Impact of Combined Factors (RQ3): After observing the impact of resource allocation and heterogeneity individually in RQ2, we aim to explore more realistic scenarios where these two factors appear simultaneously. First, we vary the number of participating clients and concurrently increase the degree of heterogeneity in the client devices. Second, we
TABLE III: Evaluation Metric Definitions.
Fig. 5. Experiments Setup for Studying the Impact of Single Factor (RQ2).
Fig. 6. Experiments Setup for Studying the Impact of Combined Factors (RQ3).
Impact of the Communication Bandwidth. Next, we investigate the effect of connection bandwidth on the update exchange time. One interesting point obtained from Fig. 9 is that the update exchange time increases linearly as we decrease the bandwidth. Specifically, when we halve the bandwidth from 100 Mbps to 50 Mbps, the update exchange time increases
TABLE IV: Behaviors of On-Device FL Implementation (RQ1).

TABLE V: Impact of the Device Heterogeneity.
(i.e., from 32 to 64), the improvement is not significant, but the update exchange time goes up dramatically. Moreover, data heterogeneity also affects the global model's accuracy significantly, especially in the ExNon-IID cases. Besides heterogeneity in the labels of local datasets, other types of data heterogeneity, such as quantity heterogeneity or distribution heterogeneity, are also important and might degrade the model's accuracy much further; however, these types of data heterogeneity are still under-explored. In addition, the update exchange time is linearly affected by the communication bandwidth. We also show that better client selection strategies are essential when dealing with heterogeneous devices, to leverage the presence of high-end devices and reduce the update exchange time. However, this is quite challenging in a real deployment, where the distributions of computing power and data are not known a priori and cannot be simulated in a controlled setting.

reduce the congestion in communication. These observations also suggest that a large number of clients and the resulting congestion have a significantly negative effect on the update exchange time, and they raise the need for novel FL algorithms capable of handling situations with massive numbers of clients.
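The linear dependence of exchange time on bandwidth follows from a first-order transfer-time model. The 10 MB model size below is a hypothetical figure for illustration, not a measurement from our testbed:

```python
def update_exchange_time(model_size_mb, bandwidth_mbps):
    """First-order model of one-way update transfer: size / bandwidth.
    It ignores protocol overhead and congestion, which make real
    on-device exchange times less predictable (see Section IV)."""
    return (model_size_mb * 8) / bandwidth_mbps  # megabits / Mbps = seconds

# Halving bandwidth from 100 Mbps to 50 Mbps doubles the transfer time:
t_full = update_exchange_time(10, 100)  # 0.8 s
t_half = update_exchange_time(10, 50)   # 1.6 s
```

The same model also shows why congestion hurts: when N clients share one uplink, the effective per-client bandwidth, and hence the exchange time, degrades roughly with N.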
Fig. 12. Combined Impact of the Number of Clients and Data Heterogeneity.
In summary, we have found that the communication congestion caused by a large number of clients has a significantly negative effect on the update exchange time. However, increasing the number of clients leads to improvements in accuracy, especially in heterogeneous data scenarios. Also, data heterogeneity is the most dominant factor affecting the model's test accuracy, especially on challenging datasets. Going beyond the fundamental image classification task, data heterogeneity might hurt the model's performance even further in other advanced tasks, such as object detection or segmentation, which are under-explored in the current literature. Interestingly, we also observe that homogeneous devices can behave differently; this may be caused by various implicit factors such as the power supply, network conditions, hardware and software variations, or user behavior.

V. DISCUSSIONS

In this section, we first discuss the practicality of FL on IoT-Edge devices, based on our experimental results, and then discuss other essential factors to consider when designing an FL system for IoT devices.

A. Practicality of FL on IoT-Edge Devices

FL requires local processing on the device, which can be challenging on lightweight devices with limited processing power. In addition, storing the model updates locally can be challenging due to limited storage capacity. Another challenge is the unreliable connectivity of IoT devices: federated learning requires a stable and reliable network connection for devices to communicate with each other and with the aggregation server, yet IoT-Edge devices are often deployed in remote locations with limited network connectivity.

In this study, we observed that the practicality of FL on IoT-Edge devices depends on the combined effects of various factors, such as device availability (the number of participating clients), communication constraints (bandwidth availability), and the heterogeneity of data (data distribution) and devices (computational capability and hardware configuration). These factors are interdependent, and hence a comprehensive analysis of the practicality of FL on IoT devices should consider them all together. For example, the computational capability of devices can affect the communication overhead, as devices with lower computational capability may take longer to process and transmit data, resulting in higher communication latency and overhead. Similarly, the heterogeneity of devices can affect the robustness of FL algorithms, as the presence of devices with varying characteristics can introduce heterogeneity in the data and make it challenging to train accurate models.

To address the processing power and storage capacity issues, we need to design models that are optimized for lightweight devices and to apply compression or distillation techniques to reduce the size of the updates. There is also a need for techniques such as asynchronous updates and checkpointing, to ensure that the training process can continue even when devices are disconnected due to network connectivity issues.

B. Other Considerable Factors

Besides the factors studied in this work, when designing FL systems it is essential to consider other factors that can cause IoT devices to underperform in FL, such as the power supply of the devices, the specifications of the memory cards, and the performance of the aggregation server.
1) Power Supply: The amount of power available to the device can impact its processing capability. If the device has a limited power supply, it may not be able to perform complex computations or transmit large amounts of data efficiently. Furthermore, the quality and reliability of the power supply can affect the device's stability and longevity; power surges or outages can damage the device's components, leading to reduced performance and potentially even complete failure. As shown in [26], when the battery life of the devices decreased, the accuracy of the global model also decreased significantly. Hence, it is crucial to ensure that devices used in FL have access to a reliable power supply with sufficient capacity to handle the demands of the learning process.

2) Memory Card Usage: The speed and capacity of the memory card can indirectly affect the overall performance of the IoT device itself. If the memory card is slow or has limited capacity, it may result in slower data processing and storage, slowing down the overall FL process. The reliability and durability of the memory card can also impact FL performance: for instance, if the memory card fails or becomes corrupted, it can result in the loss of data, which can negatively affect the accuracy and effectiveness of the FL model.

3) Performance of the Aggregation Server: The performance of the aggregation server is crucial to the success of the FL process and can have a significant impact on the participating IoT devices. The aggregation server needs sufficient computational resources to process the incoming model updates from the IoT devices. If the server is overloaded, this can cause delays or even crashes in the system, affecting the IoT devices involved. This can be particularly problematic if the IoT devices have limited resources themselves, as they may not be able to handle the increased workload.

VI. CONCLUSIONS AND FUTURE WORKS

The results of our experiments have revealed several important findings: (1) simulation of FL can be a valuable tool for algorithm testing and evaluation, but its effectiveness in accurately representing the reality of IoT-Edge deployment is very limited; (2) the disparity in computational resources among IoT devices can significantly impact the update exchange time; and (3) data heterogeneity is the most dominant factor in the presence of the other factors, especially when working in tandem with computation and network factors.

Moving forward, several areas could be explored to expand on the findings of this study. Firstly, considering the diversity of devices used in FL, it would be valuable to test the approach on a more comprehensive range of devices with different hardware, operating systems, and network connections to ensure its effectiveness and robustness. Secondly, the dataset selection process used for training the FL model could be further optimized to increase accuracy and efficiency and to ensure that the results represent all potential use cases. Additionally, to expand the scope of this study's findings, exploring FL algorithms beyond the standard FedAvg algorithm could be beneficial; these alternative algorithms may provide insights into how to improve the performance of FL on IoT-Edge devices. Lastly, the study may miss out on the potential benefits of other FL algorithms that are better suited to specific scenarios or applications: for instance, FedProx [11] is designed to handle heterogeneity in data across devices and can improve the convergence rate of the FL process. It is important to note that these future improvements do not affect the objectives and scope of the current study.

In particular, we plan to extend our study to a broader range of scenarios by examining the impact of varying network conditions, communication protocols, and the resource usage of FL. In addition, we want to conduct a comprehensive analysis of the resource consumption of FL, including battery life and network bandwidth usage. We also want to focus on real-world applications of FL on IoT devices, including developing FL-based solutions for specific IoT use cases, such as environmental monitoring and predictive maintenance, and evaluating their performance in realistic environments.

REFERENCES

[1] Internet of Things (IoT) and Non-IoT Active Device Connections Worldwide from 2010 to 2025. https://www.statista.com/statistics/1101442/iot-number-of-connected-devices-worldwide.
[2] Enabling Mass IoT Connectivity as Arm Partners Ship 100 Billion Chips. https://community.arm.com/arm-community-blogs/b/internet-of-things-blog/posts/enabling-mass-iot-connectivity-as-arm-partners-ship-100-billion-chips.
[3] Christoph Gröger. There Is No AI Without Data. Commun. ACM, 64(11):98–108, 2021.
[4] Paul Voigt and Axel von dem Bussche. The EU General Data Protection Regulation (GDPR). Springer Cham, 2017.
[5] H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Agüera y Arcas. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), 2017.
[6] Qiang Yang, Yang Liu, Tianjian Chen, and Yongxin Tong. Federated Machine Learning: Concept and Applications. ACM Trans. on Intelligent Systems and Technology, 2019.
[7] Qinbin Li, Zeyi Wen, Zhaomin Wu, Sixu Hu, Naibo Wang, Yuan Li, Xu Liu, and Bingsheng He. A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection. IEEE Trans. on Knowledge and Data Engineering, 2021.
[8] Di Wu, Rehmat Ullah, Paul Harvey, Peter Kilpatrick, Ivor Spence, and Blesson Varghese. FedAdapt: Adaptive Offloading for IoT Devices in Federated Learning. IEEE Internet of Things Journal, 9(21):20889–20901, 2022.
[9] Rui Sun, Yinhao Li, Tejal Shah, Ringo W. H. Sham, Tomasz Szydlo, Bin Qian, Dhaval Thakker, and Rajiv Ranjan. FedMSA: A Model Selection and Adaptation System for Federated Learning. Sensors, 22(19), 2022.
[10] Jed Mills, Jia Hu, and Geyong Min. Communication-Efficient Federated Learning for Wireless Edge Intelligence in IoT. IEEE Internet of Things Journal, 7(7):5986–5994, 2020.
[11] Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith. Federated Optimization in Heterogeneous Networks. In I. Dhillon, D. Papailiopoulos, and V. Sze, editors, Proceedings of Machine Learning and Systems, pages 429–450, 2020.
[12] Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris Papailiopoulos, and Yasaman Khazaeni. Federated Learning with Matched Averaging. In International Conference on Learning Representations, 2020.
[13] Huiming Chen, Huandong Wang, Qingyue Long, Depeng Jin, and Yong Li. Advancements in Federated Learning: Models, Methods, and Privacy. arXiv, 2023.
[14] Bingyan Liu, Nuoyan Lv, Yuanchun Guo, and Yawen Li. Recent Advances on Federated Learning: A Systematic Survey. arXiv, 2023.
[15] Dinh C. Nguyen, Ming Ding, Pubudu N. Pathirana, Aruna Seneviratne, Jun Li, and H. Vincent Poor. Federated Learning for Internet of Things: A Comprehensive Survey. IEEE Communications Surveys and Tutorials, 23, 2021.
[16] Ahmed Imteaj, Urmish Thakker, Shiqiang Wang, Jian Li, and M. Hadi Amini. A Survey on Federated Learning for Resource-Constrained IoT Devices. IEEE Internet of Things Journal, 9, 2022.
[17] Hangyu Zhu, Jinjin Xu, Shiqing Liu, and Yaochu Jin. Federated Learning on Non-IID Data: A Survey. Neurocomputing, 465:371–390, 2021.
[18] Qinbin Li, Yiqun Diao, Quan Chen, and Bingsheng He. Federated Learning on Non-IID Data Silos: An Experimental Study. In 2022 IEEE 38th International Conference on Data Engineering (ICDE), 2022.
[19] Koji Matsuda, Yuya Sasaki, Chuan Xiao, and Makoto Onizuka. An Empirical Study of Personalized Federated Learning. arXiv, 2022.
[20] Sebastian Caldas, Sai Meher Karthik Duddu, Peter Wu, Tian Li, Jakub Konečný, H. Brendan McMahan, Virginia Smith, and Ameet Talwalkar. LEAF: A Benchmark for Federated Settings. In Workshop on Federated Learning for Data Privacy and Confidentiality, 2019.
[21] Tuo Zhang, Chaoyang He, Tianhao Ma, Lei Gao, Mark Ma, and Salman Avestimehr. Federated Learning for Internet of Things. In Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems (SenSys '21), pages 413–419, New York, NY, USA, 2021. Association for Computing Machinery.
[22] Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images. Technical report, University of Toronto, 2009.
[23] Yue Zhao, Meng Li, Liangzhen Lai, Naveen Suda, Damon Civin, and Vikas Chandra. Federated Learning with Non-IID Data. arXiv, 2018.
[24] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems, 32, 2019.
[25] Daniel J. Beutel, Taner Topal, Akhil Mathur, Xinchi Qiu, Javier Fernandez-Marques, Yan Gao, Lorenzo Sani, Kwing Hei Li, Titouan Parcollet, Pedro Porto Buarque de Gusmão, et al. Flower: A Friendly Federated Learning Research Framework. arXiv:2007.14390, 2020.
[26] Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Konečný, Stefano Mazzocchi, Brendan McMahan, et al. Towards Federated Learning at Scale: System Design. Proceedings of Machine Learning and Systems, 1:374–388, 2019.

BIOGRAPHY SECTION

Kok-Seng Wong (Member, IEEE) received his first degree in Computer Science (Software Engineering) from the University of Malaya, Malaysia, in 2002, and an M.Sc. (Information Technology) degree from the Malaysia University of Science and Technology (in collaboration with MIT) in 2004. He obtained his Ph.D. from Soongsil University, South Korea, in 2012. He is currently an Associate Professor in the College of Engineering and Computer Science, VinUniversity. To this end, he conducts research spanning security, data privacy, and AI security, while maintaining a strong relevance to the privacy-preserving framework.

Duc-Manh Nguyen received a Master's degree in Information Science and Technology from the University of Information Science and Technology, North Macedonia. He is currently a Ph.D. candidate and a research assistant at the Technical University of Berlin. His research focuses on Robotics and Edge Computing with Machine Learning, particularly Cooperative Perception for Autonomous Vehicles.

Khiem Le-Huy received an Honors Bachelor's degree in Mathematics and Computer Science from the Vietnam National University, Ho Chi Minh City. He was a Research Intern at the Smart Health Center, VinBigData JSC, and is currently a Research Assistant at the College of Engineering and Computer Science, VinUniversity, Hanoi, Vietnam. His research interests include Efficient Machine Learning and AI for Biomedical Applications.

Long Ho-Tuan received an Honors Bachelor's degree in Computer Science from the Vietnam National University, Hanoi, Vietnam. He is currently a Research Assistant at the College of Engineering and Computer Science, VinUniversity, Hanoi, Vietnam. His research interests include Federated Learning and AI for Biomedical Applications.